Under consideration for publication in Theory and Practice of Logic Programming 



Checking Modes of HAL Programs* 

MARIA GARCIA DE LA BANDA, WARWICK HARVEY, KIM MARRIOTT 

School of Computer Science & Software Engineering, Monash University, Australia 
(e-mail: {mbanda,marriott}@csse.monash.edu.au) (e-mail: wh@icparc.ic.ac.uk)

PETER J. STUCKEY 

Department of Computer Science & Software Engineering, University of Melbourne, Australia 

(e-mail: pjs@cs.mu.oz.au) 

BART DEMOEN 

Department of Computer Science, Catholic University Leuven, Belgium 
(e-mail: bmd@cs.kuleuven.ac.be)

submitted 6 December 2002; revised 16 March 2004; accepted 15 September 2004 



Note: This article is to be published in Theory and Practice of Logic Programming.
©Cambridge University Press. 

Abstract 

Recent constraint logic programming (CLP) languages, such as HAL and Mercury, re- 
quire type, mode and determinism declarations for predicates. This information allows 
the generation of efficient target code and the detection of many errors at compile-time. 
Unfortunately, mode checking in such languages is difficult. One of the main reasons is 
that, for each predicate mode declaration, the compiler is required to appropriately re- 
order literals in the predicate's definition. The task is further complicated by the need 
to handle complex instantiations (which interact with type declarations and higher-order 
predicates) and automatic initialization of solver variables. Here we define mode check- 
ing for strongly typed CLP languages which require reordering of clause body literals. 
In addition, we show how to handle a simple case of polymorphic modes by using the 
corresponding polymorphic types. 

KEYWORDS: Strong modes, mode checking, regular grammars 



1 Introduction 

While traditional logic and constraint logic programming (CLP) languages are
untyped and unmoded, recent languages such as Mercury (Somogyi et al. 1996) and
HAL (Demoen et al. 1999b; Garcia de la Banda et al. 2002) require type, mode and
determinism declarations for (exported) predicates. This information allows the
generation of efficient target code (e.g. mode information can provide an order of
magnitude speed improvement (Demoen et al. 1999a)), improves robustness and



* A preliminary version of this paper appeared under the title "Mode Checking in HAL," in the 
Conference on Computational Logic (CL'2000), London, June 2000.

facilitates efficient integration with foreign language procedures. Here we describe
our experience with mode checking in the HAL compiler. 

HAL is a CLP language designed to facilitate "plug-and-play" experimentation 
with different solvers. To achieve this it provides support for user-defined constraint 
solvers, global variables and dynamic scheduling. Mode checking in HAL is one of 
the most complex stages in the compilation. Since predicates can be given multiple 
mode declarations, mode checking is performed for each of these modes and the 
compiler creates a specialized procedure for each mode (i.e. it performs multi-variant
specialization). Mode checking involves traversing each predicate mode declaration 
to check that if the predicate is called with the input instantiation specified by the 
mode declaration then the following two properties are satisfied. First, the predicate 
mode declaration is input-output correct, that is, it is guaranteed that if the input 
instantiation satisfies this declaration then the result is an output instantiation 
that satisfies the declaration. And second, the predicate is call correct, that is, if 
the input instantiation satisfies this declaration then each literal occurring in the 
definition of the predicate is called with an input instantiation satisfying one of its 
declared modes. 

Call correctness may require the compiler to re-order literals in the body of 
each rule, so that literals are indeed called with an appropriate input instantiation. 
Such reordering is essential in logic programming languages which wish to support 
multi-moded predicates while, at the same time, retaining a Prolog programming 
style in which a single predicate definition is provided for all modes of usage. And 
an important function of this reordering is to appropriately order the equalities 
inserted by the compiler during program normalisation for matching/constructing 
non-variable predicate arguments. The need to reorder rule bodies is one reason
why mode checking is a rather complex task. However, it is not the only reason. 
Three other issues exacerbate the difficulty of mode checking. First, instantiations 
(which describe the possible states of program variables) may be very complex and 
interact with the type declarations. Second, accurate mode checking of higher-order 
predicates is difficult. Third, the compiler needs to handle automatic initialization 
of solver variables. 

Although mode inference and checking of logic programs has been a fertile re- 
search field for many years, almost all research has focused on mode checking/inference 
in traditional (and thus untyped) logic programming languages where the analysis 
assumes the given literal ordering is fixed and cannot assume that a program is type 
correct. Thus, a main contribution of this paper is a complete definition of mode 
checking in the context of CLP languages which are strongly typed and which may 
require reordering of rule body literals during mode checking. 

A second contribution of the paper is to describe the algorithms for mode checking
currently employed in the HAL compiler. Since HAL and the logic programming
language Mercury share similar type and mode systems, 1 much of our description and
formalization also applies to mode checking in Mercury (which has not been
previously described 2). However, there are significant differences between mode checking
in the two languages. In HAL there is the need to handle automatic initialization of
solver variables, and, in general, complex modes (other than in and out) are used
more frequently since constraint solver variables are usually not ground.
Furthermore, determining the best reordering in HAL is more complex than in Mercury
because the order in which constraints are solved can have a more significant
impact on efficiency (Marriott and Stuckey 1992). Also, HAL handles a limited form
of polymorphic mode checking. On the other hand, Mercury's mode system allows
the specification of additional information about data structure liveness and usage.

1 In part, because HAL is compiled to Mercury.

The rest of the paper is organized as follows. In the following section we review
related work. Section 3 provides an informal view of the role of types, modes, and
instantiations in the HAL language. Its aim is to give insight into the more rigorous
formalization provided in Section 4, which introduces type-instantiation grammars
for combining type and instantiation information as the basis for mode checking
in HAL. Section 5 describes the basic steps performed for mode checking HAL
programs. Section 6 focuses on the automatic initialization needed by the modes
of usage of some predicates. Section 7 discusses mode checking of higher-order
predicates and objects, while Section 8 shows how to handle simple polymorphic
modes. Finally, Section 9 provides our conclusions and discusses some future work.

2 Related Work 

Starting with (Mellish 1987; Debray 1989) there has been considerable research into
mode checking and inference in traditional logic programming languages. However, 
as indicated above, there are two fundamental differences between that work and 
ours. 

First, almost all research assumes that mode analysis is not required to reorder 
clause bodies. Second, while almost all research has focused on untyped logic pro- 
gramming languages, mode checking of HAL relies on predicates and program vari- 
ables having a single (parametric polymorphic) Hindley-Milner type and the type 
correctness of the program with respect to this type. Access to type information al- 
lows us to handle more complex instantiations than are usually considered in mode 
analysis and also to handle mode checking of higher-order predicates in a more 
rigorous fashion: in most previous work higher-order predicates are largely ignored. 

Another important difference is that we are dealing with constraint logic
programming languages in which program variables need to be appropriately
initialized before being sent to some constraint solver as part of a constraint. Requiring
explicit initialization of solver variables puts additional burden on the programmer
and makes it impossible to write multi-moded predicate definitions for which
different modes require different variable initialisations. We have consequently chosen
for the HAL compiler to automatically initialize solver variables, i.e. the compiler
generates initialization code whenever necessary. In order to perform such
automatic initialization, mode checking in HAL must track which program variables are
currently uninitialized (in our terminology are new). Tracking of uninitialized
variables also supports powerful optimizations which can greatly improve performance.
For this reason the Mercury mode checker also tracks uninitialized variables.

2 Recently a thesis has been completed on Mercury mode checking (Overton 2003).

This need to track uninitialized program variables is a significant difference
between mode checking in the Mercury and HAL languages, and most logic
programming work on modes. It is not the same as tracking so-called "free" variables in
traditional logic programming. First, free variables may be aliased to other variables,
something that is not possible with uninitialized variables. Second, uninitialized
variables have to be tracked exactly: the compiler must not fail to initialize a
variable, nor should it initialize a variable more than once. We will now review
selected related work in detail.

The original work on mode checking in strongly typed logic languages with
reorderable clause bodies is that of (Somogyi 1987), which gives an informal
presentation of a mode system based on types. This is perhaps the closest work in
spirit since it was the basis of mode checking in Mercury. However, its mode sys- 
tem is much simpler than ours and it does not consider higher-order predicates 
or the problems of automatic initialization. The remaining work does not consider 
compile-time reordering. 

Perhaps the most closely related work in traditional logic programming language
analysis is the early work of (Janssens and Bruynooghe 1993), which uses regular
trees to define types and instantiations, and uses these trees to perform mode
inference. The main differences are that (Janssens and Bruynooghe 1993) does not consider
reordering or tracking uninitialized variables. Other more technical differences are
that, although we use deterministic tree grammars to formalize types, our type
analysis (Demoen et al. 1999) is based on a Hindley-Milner approach. A key
difference with this and other work such as that of (Boye and Małuszyński 1997) is that
we describe instantiations for polymorphic types, including higher-order objects.
Also, in (Janssens and Bruynooghe 1993) depth restrictions are imposed to make
the generated regular trees finite. This is not needed in our approach. Finally, they
use definite and possible sharing analysis to improve instantiation information. This
is not done yet in HAL for complexity reasons (sharing analysis is quite expensive
and thus a danger for practical compilation); however, a simple sharing and aliasing
analysis should indeed prove to be useful.

After the early work of (Janssens and Bruynooghe 1993), there has been a
significant amount of research aimed at improving the precision of the analysis by
providing additional information about the structure of the terms. Initially, this
was achieved by performing some simple pattern analysis and then providing this
information to other analyses (see for example, (Charlier and Hentenryck 1994;
Mulkers et al. 1995)). Later, with the gradual success of typed languages, pattern
information was substituted by type information with which more accurate results
could be obtained, i.e., type information was annotated with different kinds of
information, some of which were mode information (see, for example, (Ridoux et al. 1999;
Smaus et al. 2000)). But most of this work was designed to either provide a general
framework for combining type information with other kinds of information, or to
infer some particular kind of information (such as mode information) from a program
without reordering the literals in the body of predicates. Furthermore, they were
not interested in tracking uninitialized variables nor keeping enough instantiation
information (i.e. which particular tree constructors can occur) for optimizations
such as switch detection (Henderson et al. 1996). Again, further differences arise
since we consider higher-order mode inference and polymorphic modes.

Recent work on directional types (see e.g. (Boye and Małuszyński 1997)) is much
more analogous to HAL mode checking. There, they are interested in
determining mode-correctness of a program given (user supplied) mode descriptions (called
directional types). Apart from previously mentioned differences, the framework
of (Boye and Małuszyński 1997) uses directional types that are much simpler than
the instantiations that we deal with here. Interestingly, the work of (Boye and Małuszyński 1997)
uses directional type correctness to show that a run-time reordering of a well-typed
program will not deadlock, somewhat analogous to our compile-time reordering.

Type dependency analysis (Codish and Lagoon 2000) is also related to mode
checking. Their analysis determines type dependencies from which we can read all 
the correct modes or directional types of a program. The framework is however 
restricted to use types (and modes) defined by unary function symbols and an ACI 
operator. 

Other related work has been on mode checking for concurrent logic
programming languages (Codognet et al. 1990). There the emphasis has been on detecting
communication patterns and possible deadlocks.

The only other logic programming system we are aware of which does
significant mode checking is Ciao (Bueno et al. 2002). The Ciao logic programming
system (Bueno et al. 2002) does mode checking using its general assertion checking
framework CiaoPP based on abstract interpretation (Hermenegildo et al. 2003).
Modes are considered as simply one form of assertion, and indeed the notion of
what is a mode is completely redefinable. The default modes are analyzed by the
CiaoPP preprocessor using a combination of regular type inference and
groundness, freeness and sharing analyses. Ciao modes are more akin to directional types
than the strong modes of HAL and Mercury, and the compiler will check them if
possible, and optionally add run-time tests for modes that could not be checked at
compile time. As with other earlier work the fundamental differences with the HAL
mode system are in treatment of uninitialized variables, reordering, higher-order
and polymorphic modes.



3 HAL by example 

This section provides an informal view of the role of types, modes, and instantia- 
tions in the HAL language. The aim is to provide insight into the more rigorous 
formalization that will be provided in the following sections. We do this by
explaining the example HAL program shown in Figure 1, which implements a polymorphic
stack using lists. Note that HAL follows the basic CLP syntax, with variables, rules
and predicates defined as usual (see, for example, (Marriott and Stuckey 1998) for
an introduction to CLP).

:- typedef list(T) -> ( [] ; [T|list(T)] ).

:- instdef elist -> [].
:- instdef list(I) -> ( [] ; [I|list(I)] ).
:- instdef nelist(I) -> [I|list(I)].

:- modedef out(I) -> (new -> I).
:- modedef in(I) -> (I -> I).

:- pred push(list(T),T,list(T)).
:- mode push(in,in,out(nelist(ground))) is det.
push(S0,E,S1) :- S1 = [E|S0].

:- pred pop(list(T),T,list(T)).
:- mode pop(in,out,out) is semidet.
:- mode pop(in(nelist(ground)),out,out) is det.
pop(S0,E,S1) :- S0 = [E|S1].

:- pred empty(list(T)).
:- mode empty(in) is semidet.
:- mode empty(out(elist)) is det.
empty(S) :- S = [].



Fig. 1. Example HAL program implementing a polymorphic stack. 

3.1 Types 

Informally, a ground type describes a set of ground terms and is used as a reason- 
able approximation of the ground values a particular program variable can take. 
It is therefore an invariant over the lifetime of the variable. Types in HAL are
prescriptive rather than descriptive: they restrict the possible values of a variable.
Unlike much of the work performed on types for logic programming languages, our 
types only include the ground (also called fixed) values that a variable can take. 
Later we will describe how instantiations are used to express when a variable takes 
a value which is not completely fixed. 

Types are specified using type definition statements. For instance, in the example 
shown in Figure 1 the line

:- typedef list(T) -> ( [] ; [T|list(T)] ).

defines the polymorphic type constructor list/1 where list(T) is the type of lists
with elements of parametric type 3 T. These lists are made up using the []/0 and
./2 (represented by [·|·]) tree constructors.

HAL includes the usual set of built-in basic types: float (floating point numbers), 
int (integers), char (characters) and string (strings). Like most typed languages, 
HAL provides the means to define type equivalences. For example, the statement 

3 In order to clearly distinguish between program variables and any other kinds of variables (type 
variables, instantiation variables, etc) we will refer to all other kinds of variables as parameters 
(i.e., type parameters, instantiation parameters, etc).

:- typedef vector = list(int).

defines the type vector to be a list of integers. Equivalence types are simply macros 
for type expressions, and the compiler replaces equivalence types by their definition 
(circular type equivalences are not allowed). From now on we assume that equiva- 
lence types have been eliminated from the type expressions we consider. This can 
be achieved straightforwardly by applying substitution. 

Finally, HAL allows a type to be declared as hidden so that its definition is not 
visible outside the module in which it is defined. We note that the treatment of 
hidden types is almost identical to that of type parameters and so omit them for 
simplicity. 

It is important to note that a program variable's type is used by a compiler 
to determine the representation format for that variable, i.e., the particular way 
in which program variables are stored during execution. As a result, two program 
variables may have different types even though the representation of their values 
can be identical. For example, in a language providing both the ASCII character 
set and an extended international character set, variables representing each kind of 
character would need to have different types since their internal representation is 
different. 

3.2 Solvers 

In HAL a constraint solver is defined using a new type. Assume for example, that 
a programmer wishes to implement a constraint solver over floating point numbers. 
From the point of view of the user, the variables will take floating point values and 
thus one might expect them to have the built-in type float. But their internal 
representation cannot be a float as they need to keep track of internal information 
for the solver. As a result, the type of the variables cannot be the built-in type float 
but must be some other type defined by the solver, and whose implementation is 
hidden from the outside world. This is where we use abstract types, to hide this
internal view from the outside world.

Example 1 

For example, a floating point solver type cfloat might be defined as

:- typedef cfloat -> var(int) ; val(float).

where the integer in the var tree constructor refers to a column number in a (global)
simplex tableau, and the val constructor is used to represent simple fixed value
floating point numbers. □

Types defined by solvers are called solver types and variables with a solver type 
are called solver variables. Solvers must also provide an initialization procedure 
(init/1) and at least the equality (=/2) constraint for the type, although many
other constraints will usually be provided. Note that solver variables must be
initialized before they can be involved in any constraint. This is required so that the
solver can keep track of its variables and initialize the appropriate internal
data-structures for them.

The case of Herbrand solver types (i.e., types for which there is a full unification
solver) is somewhat special. Any user-defined type can be declared to be a Herbrand
solver type by annotating its type definition with the words "deriving solver". For
example:

:- typedef hlist(T) -> ( [] ; [T|hlist(T)] ) deriving solver.

defines the hlist Herbrand type. The compiler will then automatically create an 
initialization predicate for the type (which is actually identical for all Herbrand 
types) and an equality predicate for the type which handles not only simple con- 
struction, deconstruction and assignment of non-variable terms (which is the only 
equality support provided for non-solver types), but full unification. As a result, 
while variables with non-solver type list are always required to be bound at run- 
time to a list of fixed length (so that the limited support provided by construc- 
tion, deconstruction and assignment is enough), 4 variables with type hlist may be 
bound to open ended lists, where the tail of the list is an unbound (list) variable. 

3.3 Instantiations 

Instantiations define the set of values, within a type, that a program variable may 
have at a particular program point in the execution, as well as the possibility that 
the variable (as yet) takes no value. Instantiation information is vital to the com- 
piler to determine whether equations on terms are being used to construct terms, 
deconstruct terms or check the equality of two terms. Furthermore, instantiation 
information is needed to infer the determinism of predicates (i.e., how many answers 
a predicate has) and to perform many other low-level optimizations. 

Although instantiations may seem very similar to types, they should not be con- 
fused: a type is invariant over the life of the variable, while instantiations change. 
Additionally, instantiations reflect the possibility of a variable having no value yet, 
or being "constrained" to some unknown set of values. 

HAL provides three base instantiations for a variable: ground, old and new. A 
variable is ground if it is known to have a unique value; the compiler might not know 
exactly which value within the type (it might depend on the particular execution),
but it knows it is fixed (for a solver variable this happens whenever the variable 
cannot be constrained further). 

A variable is new if it has not been initialized and it has never appeared in a 
constraint (thus the name new). Thus, it is known to take no value yet. As we 
have indicated, the instantiation new leads to a crucial difference between mode 
checking in Mercury and HAL, and that investigated in most other research into 
mode checking of logic programs. Mercury and HAL demand that at each point in 
execution the compiler knows whether a variable has a value or not. This allows 
many compiler optimizations, and is a key to the difference in execution speed
between Mercury and HAL and most other logic programming systems. The requirement to
always have accurate instantiation information about which variables are new drives 

4 Note that the elements inside the list need not be ground!

many of the decisions made in the mode checking system. In particular, it means
that a new variable is not allowed to appear inside a data structure, and can only 
be given a value by assignment or, if it is a solver variable, after initialization. 

Finally, the instantiation old is used to describe a solver variable that has been 
initialized but for which nothing is known about its possible values. Note that 
the variable might be unconstrained, it might be ground, or anything in between 
(e.g. be greater than 5); the compiler simply does not know. In the case in which 
old is associated with a non-solver variable, it is deemed to be equivalent to ground. 
Note that in Mercury, where there are no solver types, each variable always has an 
instantiation which is either new or (a subset of) ground.

It is important to note that new is not analogous to free in the usual logic pro- 
gramming sense. A free variable in the HAL context is an old variable (thus, it 
has been initialized by the appropriate solver) which has never been bound to a 
non-variable term. Thus, free variables might have been aliased, while new variables
cannot. This is exploited by the compiler by not giving a run-time representation 
to new variables. As a consequence, a new variable cannot occur syntactically more 
than once. 
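
The rules stated so far for the three base instantiations can be summarised in a small Python sketch. The names `Inst`, `normalise` and `may_be_stored_in_structure` are hypothetical and chosen only for this illustration; the sketch records the informal rules above and is not the formal treatment given later in the paper.

```python
from enum import Enum

class Inst(Enum):
    NEW = "new"        # not initialized, has never appeared in a constraint
    OLD = "old"        # initialized, but nothing is known about its value
    GROUND = "ground"  # known to have a unique (fixed) value

def normalise(inst, is_solver_type):
    """For a non-solver variable, old is deemed equivalent to ground."""
    if inst is Inst.OLD and not is_solver_type:
        return Inst.GROUND
    return inst

def may_be_stored_in_structure(inst):
    """A new variable is not allowed to appear inside a data structure."""
    return inst is not Inst.NEW
```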

For data structures such as trees or lists of solver variables, more complex instan- 
tiation states may be used. These instantiations are specified using instantiation 
definition statements which look very much like type definitions, the only differ- 
ence being that the arguments themselves are instantiations rather than types. For 
instance, in the example shown in Figure 1 the lines

:- instdef elist -> [].
:- instdef list(I) -> ( [] ; [I|list(I)] ).
:- instdef nelist(I) -> [I|list(I)].

define the instantiation constructors elist/0, list/1 and nelist/1, which in the
example are associated with variables of type list/1. In that context, the
instantiation elist describes empty lists. The polymorphic instantiation list(I) describes
lists with elements of parametric instantiation I (note the deliberate reuse of the
type name). Finally, the instantiation nelist(I) describes non-empty lists with
elements of parametric instantiation I.

When associated with a variable, an instantiation requires the variable to be 
bound to one of the outer-most functors in the right-hand-side of its definition, 
and the arguments of the functor to satisfy the instantiation of the corresponding 
arguments in the instantiation definition. In the case of elist, it would mean the 
variable is ground. In the remaining two cases, it would depend on the parametric 
instantiation I, but at the very least the variable would be known to be a nil- 
terminated list, i.e. its length is fixed. 
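
To make this reading of instantiation definitions concrete, the following Python sketch checks a term against the three instantiations of Figure 1. Everything about the encoding is an assumption of the sketch: terms are tuples `(functor, arg1, ..., argn)`, the string `"_"` stands for an initialized but unbound solver variable, and `INSTDEFS` and `satisfies` are hypothetical names. It is meant as an informal model, not as the type-instantiation grammars the paper defines later.

```python
# Hypothetical term encoding: (functor, arg1, ..., argn); "_" is an
# initialized but unbound solver variable.
NIL = ("[]",)

def cons(head, tail):
    return (".", head, tail)

# Each instdef maps its parameter I to the allowed alternatives, given as
# (functor, instantiations of the functor's arguments).
INSTDEFS = {
    "elist":  lambda i: [("[]", ())],
    "list":   lambda i: [("[]", ()), (".", (i, ("list", i)))],
    "nelist": lambda i: [(".", (i, ("list", i)))],
}

def satisfies(term, inst):
    if inst == "old":
        return True                  # nothing is known about the value
    if inst == "ground":
        return term != "_" and all(satisfies(a, "ground") for a in term[1:])
    name, param = inst               # e.g. ("list", "ground")
    if term == "_":
        return False                 # must be bound to an outermost functor
    for functor, arg_insts in INSTDEFS[name](param):
        if term[0] == functor and len(term) - 1 == len(arg_insts):
            return all(satisfies(a, i) for a, i in zip(term[1:], arg_insts))
    return False
```

In this model a list whose tail is still unbound, such as `cons(("a",), "_")`, does not satisfy `("list", "ground")`, which mirrors the observation that these instantiations describe nil-terminated lists.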

Note that the separation of instantiation information from type information
means we can associate the same instantiation with different types. For example,
a program variable with solver type hlist(int) and instantiation list(ground)
indicates that the program variable has a fixed length list as its value. A program
variable with non-solver type list(int) and instantiation list(ground) indicates
the same, but since the type is not a solver type, this would always be the case. The
separation of instantiation information from type information also makes the
handling of polymorphic application much more straightforward, since we will simply
associate a different type with the same instantiation.

As mentioned before, the instantiation new is not allowed to appear as an argu- 
ment of any other instantiation. As a result, a variable can only be inserted in a data 
structure if it is either ground or initialized (and thus must be old). The main rea- 
son for this is the requirement for accurate mode information about new variables. 
It quickly becomes very difficult to always have correct instantiation information 
about which variables (and parts of data structures) are new. While sharing and 
aliasing analyses might allow us to keep track of which variables are new in more
situations, inevitably they lead to situations where we cannot determine whether the
value of a variable is new or not, which is not acceptable to the compiler. We do 
however plan to use sharing and aliasing analysis to keep track of initialized (old) 
variables that have yet to be constrained (analogous to free variables in Prolog). 

3.4 Modes 

A mode is of the form $Inst_1 \rightarrow Inst_2$ where $Inst_1$ describes the call (or input)
instantiation and $Inst_2$ describes the success (or output) instantiation. The base
modes are mappings from one base instantiation to another: we use two letter
codes (oo, no, og, gg, ng) based on the first letter of the instantiation, e.g. ng is
new -> ground. The usual modes in and out are also provided (as renamings of gg
and ng, respectively).

Modes are specified using mode definition statements. For instance, in the
example shown in Figure 1 the lines

:- modedef out(I) -> (new -> I).
:- modedef in(I) -> (I -> I).

are mode definitions, defining macros for modes. The out(I) mode requires a new
object on call and returns an object with instantiation I. The in(I) mode requires
instantiation I on call and has the same instantiation on success.

HAL allows the programmer to define mode equivalences and instantiation equiv- 
alences. As for type equivalences, from now on we assume that these equivalences 
have been eliminated from the program. For example

:- modedef in = in(ground).
:- modedef out = out(ground).

define in as equivalent to ground -> ground, and out as new -> ground.

3.5 Equality 

The equality constraint is a special predicate in HAL. Equality will be normalized
in HAL programs to take one of two forms $x_1 = x_2$, and $x = f(x_1, \ldots, x_n)$ where
$x, x_1, \ldots, x_n$ are variables. Each form of equality supports a number of modes.

The equality $x_1 = x_2$ can be used in two modes. In the first mode, copy (:=),
either $x_1$ or $x_2$ must be new and the other variable must not be new. Assuming $x_1$
is new the value of $x_2$ is copied into $x_1$. In the second mode unify (==) both $x_1$
and $x_2$ must not be new. This requires a full unification.

The equality $x = f(x_1, \ldots, x_n)$ can also be used in two modes. In the first mode,
construct (:=), $x$ must be new and each of $x_1, \ldots, x_n$ not new. A new $f$ structure
is built on the heap, and the values of $x_1, \ldots, x_n$ are copied into this structure. In
the second mode, deconstruct (=:), each of $x_1, \ldots, x_n$ must be new and $x$ must
not be new. If $x$ is of the form $f(a_1, \ldots, a_n)$ then the value of each $a_i$ is copied into $x_i$,
otherwise the deconstruct fails. 5 As we shall see later, mode checking shall extend
the use of these modes for other implicit modes.
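
The four explicit equality modes can be summarised by a small Python sketch that selects a mode from whether each variable is new. The function names and the boolean encoding are assumptions of the sketch; combinations that match none of the four modes return `None` here, standing in for the implicit modes that are treated later.

```python
def var_equality_mode(x1_new, x2_new):
    """Mode for x1 = x2, given whether each variable is new."""
    if x1_new and x2_new:
        return None              # neither explicit mode applies
    if x1_new or x2_new:
        return "copy"            # the non-new value is copied into the new one
    return "unify"               # both not new: full unification

def functor_equality_mode(x_new, args_new):
    """Mode for x = f(x1, ..., xn); args_new says whether each xi is new."""
    if x_new and not any(args_new):
        return "construct"       # build a new f structure on the heap
    if not x_new and all(args_new):
        return "deconstruct"     # copy the arguments of x out, or fail
    return None                  # not one of the explicit modes
```

For the program of Figure 1, the body of pop in its first mode has S0 not new and E, S1 new, so `functor_equality_mode(False, [True, True])` selects deconstruct, while the body of push has S1 new and E, S0 not new, which selects construct.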

3.6 Predicate Declarations 

HAL allows the programmer to declare the type and modes of usage of predicates. 
In our example of Figure 1 the lines

:- pred pop(list(T),T,list(T)).
:- mode pop(in,out,out) is semidet.
:- mode pop(in(nelist(ground)),out,out) is det.

give such declarations for predicate pop/3. The first line is a polymorphic type 
declaration (with parametric type T). It specifies the types of each of the three 
arguments of pop/3. The second and third lines are mode declarations specifying 
the two different modes in which the predicate can be executed. For example, in 
the first mode the first argument is ground on call and success, while the second 
and third arguments are new on call and ground on success. 

Each mode declaration for a predicate defines a procedure, a different way of 
executing the predicate. The role of mode checking is not just to show these modes 
are correct, but also to reorder conjunctions occurring in the predicate definition 
in order to create these procedures. 

The second and third lines also contain a determinism declaration. These describe 
how many answers a predicate may have for a particular mode of usage: nondet 
means any number of solutions; multi at least one solution; semidet at most one 
solution; det exactly one solution; failure no solutions; and erroneous a run-time 
error. Thus, in the second line, since pop/3 for this mode of usage is guaranteed to 
have at most one solution but might fail (when the first argument is an empty list), 
the determinism is semidet. For the second mode, the first argument is not only 
known to be ground but also to be a non-empty list. As a result, the predicate can 
be ensured to have exactly one solution and so its determinism is det. Notice how by 
providing more complex instantiations we can improve the determinism information 
of the predicate. More complex instantiations also lead to more efficient code, since unnecessary checks 
(e.g. that the first argument of pop/3 is bound to './2) are eliminated. 
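
One way to read the determinism categories is as bounds on the number of solutions. The sketch below encodes that reading in Python; the table and the helper name `at_least_as_precise` are assumptions introduced for illustration, and `erroneous` is omitted because it denotes a run-time error rather than a solution count.

```python
import math

# (minimum, maximum) number of solutions allowed by each determinism.
SOLUTION_BOUNDS = {
    "det": (1, 1),
    "semidet": (0, 1),
    "multi": (1, math.inf),
    "nondet": (0, math.inf),
    "failure": (0, 0),
}


def at_least_as_precise(d1, d2):
    """True if every solution count allowed by d1 is also allowed by d2."""
    lo1, hi1 = SOLUTION_BOUNDS[d1]
    lo2, hi2 = SOLUTION_BOUNDS[d2]
    return lo2 <= lo1 and hi1 <= hi2
```

Under this reading the second mode of pop/3 (det) is at least as precise as the first (semidet), which matches the observation that a more specific instantiation can improve the determinism.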

Currently, HAL requires predicate mode declarations for each predicate and 
checks they are correct. Predicate type declarations, on the other hand, can be 
omitted and, if so, will be inferred by the compiler (Demoen et al. 1999). 



5 This is a simplistic high-level view; the system actually uses PARMA bindings and things are 
more complicated. See (Demoen et al. 1999a) for details. 

4 Type, Instantiation and Type-Instantiation Grammars 

In this section we formalize type and instantiation definitions in terms of (ex- 
tended) regular tree grammars. Then we introduce type-instantiation (ti-) gram- 
mars which combine type and instantiation information and are the basis for mode 
checking in HAL. Throughout the section we will use teletype font when referring 
to (fixed) type and instantiation expressions, and sans serif font when referring to 
non-terminals of tree grammars. 

4.1 HAL Programs 

We begin by defining basic terminology and HAL programs. 

A signature \(\Sigma\) is a set of pairs \(f/n\) where \(f\) is a function symbol and \(n \geq 0\) is 
the integer arity of \(f\). A function symbol with arity 0 is called a constant. Given a 
signature \(\Sigma\) the set of all trees (the Herbrand Universe), denoted \(\tau(\Sigma)\), is defined 
as the least set satisfying: 

\[ \tau(\Sigma) = \bigcup_{f/n \in \Sigma} \{ f(t_1, \ldots, t_n) \mid \{t_1, \ldots, t_n\} \subseteq \tau(\Sigma) \}. \]

We assume (for simplicity) that \(\Sigma\) contains at least one constant symbol (i.e. a 
symbol with arity 0). 

Let \(V\) be a set of symbols called variables. The set of all terms over \(\Sigma\) and \(V\), 
denoted \(\tau(\Sigma, V)\), is similarly defined as the least set satisfying: 

\[ \tau(\Sigma, V) = V \cup \bigcup_{f/n \in \Sigma} \{ f(t_1, \ldots, t_n) \mid \{t_1, \ldots, t_n\} \subseteq \tau(\Sigma, V) \} \]

A substitution over signature \(\Sigma\) and variable set \(V\) is a mapping from variables 
to terms in \(\tau(\Sigma, V)\), written \(\{x_1/t_1, \ldots, x_n/t_n\}\). We extend substitutions to map 
terms in the usual way. A unifier for two terms \(t\) and \(t'\) is a substitution \(\theta\) such that 
\(\theta(t)\) and \(\theta(t')\) are syntactically identical. A most general unifier of two terms \(t\) and 
\(t'\), denoted \(mgu(t, t')\), is a unifier \(\theta\) such that for every other unifier \(\theta'\) of \(t\) and \(t'\) there 
exists a substitution \(\theta''\) such that \(\theta'\) is the composition of \(\theta\) with \(\theta''\). Note that the 
only substitutions we shall deal with are over type and instantiation parameters. 

As we will be dealing with programs, types and instantiations there will be a 
number of signatures of interest. Let \(V_{prog}\) be the set of program variable symbols, 
\(\Sigma_{tree}\) be the tree constructors appearing in the program, and \(\Sigma_{pred}\) be the 
predicate symbols appearing in the program. Let \(V_{type}\) and \(\Sigma_{type}\) be the type variables 
and type constructors, and similarly let \(V_{inst}\) and \(\Sigma_{inst}\) be the instantiation 
variables and instantiation constructors. Note that these alphabets may overlap. 

An atom is of the form \(p(s_1, \ldots, s_n)\) where \(\{s_1, \ldots, s_n\} \subseteq \tau(\Sigma_{tree}, V_{prog})\) and 
\(p/n \in \Sigma_{pred}\). A literal is either an atom, a variable-variable equation \(x_1 = x_2\) 
where \(\{x_1, x_2\} \subseteq V_{prog}\), or a variable-functor equation \(x = f(x_1, \ldots, x_n)\) where 
\(f/n \in \Sigma_{tree}\) and \(x, x_1, \ldots, x_n\) are distinct elements of \(V_{prog}\). A goal \(G\) is a literal, a 
conjunction of goals \(G_1, \ldots, G_n\), a disjunction of goals \(G_1 ; \cdots ; G_n\) or an if-then-else 
\(G_i \rightarrow G_t ; G_e\) (where \(G_i, G_t, G_e\) are goals). A predicate definition is of the form 
A :- G where A is an atom and G is a goal. 

Note that we are assuming the programs have been normalized, so that each 
literal has distinct variables as arguments, each equality is either of the form \(x_1 = x_2\) 
or \(x = f(x_1, \ldots, x_n)\), where \(x, x_1, \ldots, x_n\) are distinct variables, and multiple bodies 
for a single predicate have been replaced by one disjunctive body. 

A predicate type declaration is of the form 

:- pred p(t_1, ..., t_n) 

where \(\{t_1, \ldots, t_n\} \subseteq \tau(\Sigma_{type}, V_{type})\) are type expressions. A predicate mode declaration 
is of the form 

:- mode p(c_1 -> s_1, ..., c_n -> s_n) 

where \(\{c_1, \ldots, c_n, s_1, \ldots, s_n\} \subseteq \tau(\Sigma_{inst})\) are ground instantiation expressions. A 
complete predicate definition for predicate symbol \(p/n \in \Sigma_{pred}\) consists of a predicate 
definition, a predicate type declaration, and a non-empty set of predicate mode 
declarations for \(p/n\). A program is a collection of complete predicate definitions for 
distinct predicate symbols. 

4.2 Tree Grammars 

Tree grammars are a well understood formalism (see, for example, (Gecseg and Steinby 1984; 
Comon et al. 1997)) for defining regular tree languages. We first review the standard 
definitions for tree grammars since we shall have to extend these in order to 
handle the complexities of mode checking. 

A tree grammar \(r\) over signature \(\Sigma\) and non-terminal set \(NT\) is a finite set of 
production rules of the form \(x \rightarrow t\) where \(x \in NT\) and \(t\) is of the form \(f(x_1, \ldots, x_n)\) 
where \(f/n \in \Sigma\) and \(\{x_1, \ldots, x_n\} \subseteq NT\). For each \(x \in NT\) and \(f/n \in \Sigma\) we require 
that there is at most one rule of the form \(x \rightarrow f(x_1, \ldots, x_n)\); hence the grammars 
are deterministic. 

We have chosen to restrict ourselves to deterministic tree grammars: these grammars 
are expressive enough for Hindley-Milner types and they give rise to simpler, 
more efficient algorithms, an important consideration for a compiler designed for 
large real-world programs. 

We assume that from a grammar r we can determine its root non-terminal, de- 
noted root(r). In reality this is an additional piece of information attached to each 
grammar. We shall write grammars so that the root non-terminal appears on the 
left hand side of the first production rule in r. 

It will often be useful to extract a sub-grammar \(r'\) from a grammar \(r\) defining 
some non-terminal \(x\) appearing in \(r\). If \(x\) is a non-terminal occurring in grammar 
\(r\), then \(subg(x, r)\) is the set of rules in \(r\) for \(x\) and all other non-terminals reachable 
from \(x\). Or more precisely, \(subg(x, r)\) is the smallest set of rules satisfying 

\[ subg(x, r) \supseteq \{ x \rightarrow t \in r \} \]

\[ subg(x, r) \supseteq \{ x' \rightarrow t \in r \mid x' \in NT, \exists x'' \rightarrow g(x''_1, \ldots, x', \ldots, x''_m) \in subg(x, r) \} \]

The root of the grammar \(subg(x, r)\) is \(x\), i.e. \(root(subg(x, r)) = x\). 

Example 2 

Consider the signature {[]/0, './2, a/0, b/0, c/0, d/0} and the non-terminal set {abc, 
list(abc), bcd, evenlist(bcd), oddlist(bcd)}, then two example regular tree grammars 
over this signature and non-terminal set are \(r_1\): 

list(abc) -> [] 
list(abc) -> [abc|list(abc)] 
abc -> a 
abc -> b 
abc -> c 

and \(r_2\): 

evenlist(bcd) -> [] 
evenlist(bcd) -> [bcd|oddlist(bcd)] 
oddlist(bcd) -> [bcd|evenlist(bcd)] 
bcd -> b 
bcd -> c 
bcd -> d 

The root non-terminal of \(r_1\) is list(abc), while the root non-terminal of \(r_2\) is 
evenlist(bcd). The grammar \(subg(abc, r_1)\) consists of the last three rules of \(r_1\) while the 
grammar \(subg(oddlist(bcd), r_2)\) includes all of the rules of \(r_2\) but we would write the 
third rule in the first position, to indicate the root non-terminal was oddlist(bcd). 
□ 
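
The definition of subg amounts to a reachability computation. The following Python sketch shows one possible encoding, under the assumption (made here, not in the paper) that a deterministic grammar is a dictionary from non-terminal names to a dictionary keyed by (functor, arity) whose values are the tuples of child non-terminals; the names `R1`, `R2` and `subg` are illustrative.

```python
# Grammars r1 and r2 of Example 2; "." is the list constructor.
R1 = {
    "list(abc)": {("[]", 0): (), (".", 2): ("abc", "list(abc)")},
    "abc": {("a", 0): (), ("b", 0): (), ("c", 0): ()},
}
R2 = {
    "evenlist(bcd)": {("[]", 0): (), (".", 2): ("bcd", "oddlist(bcd)")},
    "oddlist(bcd)": {(".", 2): ("bcd", "evenlist(bcd)")},
    "bcd": {("b", 0): (), ("c", 0): (), ("d", 0): ()},
}


def subg(x, r):
    """Rules of r for x and for every non-terminal reachable from x."""
    result = {}
    pending = [x]
    while pending:
        nt = pending.pop()
        if nt in result or nt not in r:
            continue
        result[nt] = r[nt]
        for children in r[nt].values():
            pending.extend(children)   # follow the right-hand sides
    return result
```

With this encoding `subg("abc", R1)` keeps only the rules for abc, while `subg("oddlist(bcd)", R2)` keeps every rule of `R2`; the root is simply the first argument.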

A production of form \(x \rightarrow s\) in some grammar \(r\) can be used to rewrite a term 
\(t \in \tau(\Sigma, NT)\) containing an occurrence of \(x\) to the term \(t' \in \tau(\Sigma, NT)\) where \(t'\) is 
obtained by replacing the occurrence of \(x\) in \(t\) by \(s\). This is called a derivation step 
and is denoted by \(t \Rightarrow t'\). We let \(\Rightarrow^*\) be the transitive, reflexive closure of \(\Rightarrow\). The 
language generated by \(r\), denoted by \([\![r]\!]\), is the set 

\[ \{ t \in \tau(\Sigma) \mid root(r) \Rightarrow^* t \} \]



Example 3 

For example, consider the grammars of Example 2. The set \([\![r_1]\!]\) is all lists of a's, 
b's and c's, while \([\![r_2]\!]\) is all even length lists of b's, c's and d's. □ 
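
Because the grammars are deterministic, membership of a tree in the generated language can be decided by a single top-down traversal. The Python sketch below illustrates this for the grammars of Example 2; the dictionary encoding, the tuple encoding of trees and the helper names are assumptions made for the illustration.

```python
# Non-terminal -> {(functor, arity): child non-terminals}; "." builds lists.
R1 = {
    "list(abc)": {("[]", 0): (), (".", 2): ("abc", "list(abc)")},
    "abc": {("a", 0): (), ("b", 0): (), ("c", 0): ()},
}
R2 = {
    "evenlist(bcd)": {("[]", 0): (), (".", 2): ("bcd", "oddlist(bcd)")},
    "oddlist(bcd)": {(".", 2): ("bcd", "evenlist(bcd)")},
    "bcd": {("b", 0): (), ("c", 0): (), ("d", 0): ()},
}


def generates(r, nt, tree):
    """True if non-terminal nt of grammar r derives the ground tree.

    A tree is a tuple (functor, subtree_1, ..., subtree_n).
    """
    functor, *subtrees = tree
    children = r.get(nt, {}).get((functor, len(subtrees)))
    if children is None:          # no rule nt -> functor(...)
        return False
    return all(generates(r, c, s) for c, s in zip(children, subtrees))


def make_list(constants):
    """Encode a list of constant names as a tree built from "." and "[]"."""
    tree = ("[]",)
    for name in reversed(constants):
        tree = (".", (name,), tree)
    return tree
```

For example, `generates(R2, "evenlist(bcd)", make_list(["b", "c"]))` holds, whereas the one-element list `make_list(["b"])` is rejected because its length is odd.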

For brevity we shall often write tree grammars in a more compressed form. We 
use 

\[ x \rightarrow t_1 ; t_2 ; \cdots ; t_n \]

as shorthand for the set of production rules: \(x \rightarrow t_1, x \rightarrow t_2, \ldots, x \rightarrow t_n\). 

The \([\![\cdot]\!]\) function induces a pre-order on tree grammars: \(r_1 \preceq r_2\) iff \([\![r_1]\!] \subseteq [\![r_2]\!]\). If 
we regard grammars with the same language as equivalent, \(\preceq\) gives rise to a natural 
partial order over these equivalence classes of tree grammars. In fact they form a 
lattice. However, we shall largely ignore these equivalence classes since all of our 
operations work on concrete grammars. 

We shall also make use of two special grammars. The first is the least tree grammar, 
which we denote by \(\bot\). We define that \([\![\bot]\!] = \emptyset\), and so, as its name suggests, 
we have that \(\bot \preceq r\) for all grammars \(r\). During mode checking the \(\bot\) grammar 
indicates where execution is known to fail. The second special grammar is the error 
grammar, denoted by \(\top\). It is used to indicate that a mode error has occurred and 
we define that \(r \preceq \top\) for all tree grammars \(r\). 

We use \(\sqcap\) to denote the meet (i.e. greatest lower bound) operator on grammars, 
and \(\sqcup\) to denote the join (i.e. least upper bound) operator. We assume that the 
non-terminals appearing in the two grammars to be operated on are renamed apart. 

We have that \([\![r_1 \sqcap r_2]\!] = [\![r_1]\!] \cap [\![r_2]\!]\). Because we restrict ourselves to deterministic 
tree grammars the join is inexact. That is to say, \([\![r_1 \sqcup r_2]\!] \supseteq [\![r_1]\!] \cup [\![r_2]\!]\), and for some 
\(r_1\) and \(r_2\), \([\![r_1 \sqcup r_2]\!] \neq [\![r_1]\!] \cup [\![r_2]\!]\). Of course, since it is the join, it is as precise as 
possible: for any grammar \(r\) such that \([\![r]\!] \supseteq [\![r_1]\!] \cup [\![r_2]\!]\), we have that \([\![r]\!] \supseteq [\![r_1 \sqcup r_2]\!]\). 

Algorithms for determining if \(r_1 \preceq r_2\), and constructing \(r_1 \sqcap r_2\) and \(r_1 \sqcup r_2\) are 
straightforward and omitted (see footnote 6). 

Example 4 

Consider the grammars \(r_1\) and \(r_2\) of Example 2. Their meet \(r_1 \sqcap r_2\) is: 

meet(list(abc),evenlist(bcd)) -> [] ; [meet(abc,bcd) | meet(list(abc),oddlist(bcd))] 
meet(abc,bcd) -> b ; c 
meet(list(abc),oddlist(bcd)) -> [meet(abc,bcd) | meet(list(abc),evenlist(bcd))] 

while their join \(r_1 \sqcup r_2\) is: 

join(list(abc),evenlist(bcd)) -> [] ; [join(abc,bcd) | join(list(abc),oddlist(bcd))] 
join(abc,bcd) -> a ; b ; c ; d 
join(list(abc),oddlist(bcd)) -> [] ; [join(abc,bcd) | join(list(abc),evenlist(bcd))] 

Note that the language generated by the grammar \(r_1 \sqcup r_2\) could be represented with 
fewer rules. In the compiler there is no effort to build minimal representations of 
grammars since non-minimal grammars do not seem to occur that often in practice. 
□ 
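
The paper omits the meet and join algorithms. As a rough illustration only, the Python sketch below uses a product construction over pairs of non-terminals: the meet keeps the functors common to both sides, while the join keeps all functors and pairs up children only where both sides have a rule. The encoding of grammars as dictionaries and all function names are assumptions, and the two input grammars are assumed to be renamed apart.

```python
R1 = {
    "list(abc)": {("[]", 0): (), (".", 2): ("abc", "list(abc)")},
    "abc": {("a", 0): (), ("b", 0): (), ("c", 0): ()},
}
R2 = {
    "evenlist(bcd)": {("[]", 0): (), (".", 2): ("bcd", "oddlist(bcd)")},
    "oddlist(bcd)": {(".", 2): ("bcd", "evenlist(bcd)")},
    "bcd": {("b", 0): (), ("c", 0): (), ("d", 0): ()},
}


def meet(r1, x1, r2, x2, out):
    """Add rules for meet(x1,x2) to out and return that non-terminal."""
    name = "meet(%s,%s)" % (x1, x2)
    if name in out:                      # already built (or being built)
        return name
    out[name] = {}
    for key, kids1 in r1.get(x1, {}).items():
        kids2 = r2.get(x2, {}).get(key)
        if kids2 is not None:            # functor present on both sides
            out[name][key] = tuple(
                meet(r1, a, r2, b, out) for a, b in zip(kids1, kids2))
    return name


def copy_reachable(r, roots, out):
    """Copy the rules of r reachable from roots into out unchanged."""
    pending = list(roots)
    while pending:
        nt = pending.pop()
        if nt not in out and nt in r:
            out[nt] = dict(r[nt])
            for kids in r[nt].values():
                pending.extend(kids)


def join(r1, x1, r2, x2, out):
    """Add rules for join(x1,x2) to out and return that non-terminal."""
    name = "join(%s,%s)" % (x1, x2)
    if name in out:
        return name
    out[name] = {}
    rules1, rules2 = r1.get(x1, {}), r2.get(x2, {})
    for key in list(rules1) + [k for k in rules2 if k not in rules1]:
        if key in rules1 and key in rules2:
            kids = tuple(join(r1, a, r2, b, out)
                         for a, b in zip(rules1[key], rules2[key]))
        else:                            # functor on one side only
            source = r1 if key in rules1 else r2
            kids = source[x1 if key in rules1 else x2][key]
            copy_reachable(source, kids, out)
        out[name][key] = kids
    return name
```

Calling `meet(R1, "list(abc)", R2, "evenlist(bcd)", {})` with a fresh dictionary reproduces the three meet rules of Example 4. The meet sketch may leave non-terminals with an empty language in the result; it makes no attempt to prune them.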



4.3 Types 

Types in HAL are polymorphic Hindley-Milner types. Type expressions (or types) 
are terms in the language \(\tau(\Sigma_{type}, V_{type})\) where \(\Sigma_{type}\) are type constructors and 
variables \(V_{type}\) are type parameters. Each type constructor \(f/n \in \Sigma_{type}\) must have 
a definition. 

Definition 5 

A type definition for \(f/n \in \Sigma_{type}\) is of the form 

\[ \texttt{:- typedef } f(v_1, \ldots, v_n) \texttt{ -> } (f_1(t^1_1, \ldots, t^1_{m_1}) ; \cdots ; f_k(t^k_1, \ldots, t^k_{m_k})). \]

where \(v_1, \ldots, v_n\) are distinct type parameters, \(\{f_1/m_1, \ldots, f_k/m_k\} \subseteq \Sigma_{tree}\) are 
distinct tree constructor/arity pairs, and \(t^1_1, \ldots, t^k_{m_k}\) are type expressions involving 
at most parameters \(v_1, \ldots, v_n\). The type definition for \(f/n\) may optionally have 
deriving solver appended. If so then types of the form \(f(t_1, \ldots, t_n)\) are solver 
types, otherwise they are non-solver types. □ 

6 The final operations of interest are given in the appendix. 

Clearly, the type definition for \(f\) can be viewed as simply a set of production 
rules over signature \(\Sigma_{tree}\) and non-terminal set \(\tau(\Sigma_{type}, V_{type})\). 

We can associate with each (non-parameter) type expression the production rules 
that define the topmost symbol of the type. Let \(t\) be a type expression of the form 
\(f(t_1, \ldots, t_n)\) and let \(f/n\) have type definition 

\[ \texttt{:- typedef } f(v_1, \ldots, v_n) \texttt{ -> } (f_1(t^1_1, \ldots, t^1_{m_1}) ; \cdots ; f_k(t^k_1, \ldots, t^k_{m_k})). \]

We define \(rules(t)\) to be the production rules: 

\[ \theta(f(v_1, \ldots, v_n)) \rightarrow f_1(\theta(t^1_1), \ldots, \theta(t^1_{m_1})) ; \cdots ; f_k(\theta(t^k_1), \ldots, \theta(t^k_{m_k})) \]

where \(\theta = \{v_1/t_1, \ldots, v_n/t_n\}\). If \(t \in V_{type}\) we define \(rules(t)\) to be the empty set. 

We can extend this notation to associate a tree grammar with a type expression. 
Let \(grammar(t)\) be the least set of production rules such that: 

\[ grammar(t) \supseteq rules(t) \]

\[ grammar(t) \supseteq \bigcup \{ rules(t') \mid \exists x' \rightarrow g(t'_1, \ldots, t', \ldots, t'_m) \in grammar(t) \} \]

We assume that \(root(grammar(t)) = t\). Note at this point we make no distinction 
between solver types and non-solver types; this will only occur once we consider 
instantiations. 

In order to avoid type expressions that depend on an infinite number of types 
we restrict the type definitions to be regular (Mycroft 1984). A type \(t\) is regular if 
\(grammar(t)\) is finite (see footnote 7). 

Consider for example the non-regular type definition: 

:- typedef erk(T) -> node(erk(list(T)), T). 

The meaning of the type erk(int) depends on the meaning of the type erk(list(int)), 
which depends on the meaning of the type erk(list(list(int))), etc. By restricting 
to regular types we are guaranteed that each type expression only involves a 
finite number of types. 
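
The closure that defines grammar(t), and the way a non-regular definition makes it diverge, can be observed with a small worklist computation. The Python sketch below is an illustration under several assumptions of our own: type expressions are nested tuples such as `("list", ("abc",))`, type parameters are bare strings, built-in types such as int simply have no entry in the table, and the `limit` parameter is an arbitrary cut-off used to give up on types that look non-regular.

```python
# name -> (parameters, [(functor, argument types), ...])
TYPEDEFS = {
    "abc": ((), [("a", ()), ("b", ()), ("c", ())]),
    "list": (("T",), [("[]", ()), (".", ("T", ("list", "T")))]),
    "erk": (("T",), [("node", (("erk", ("list", "T")), "T"))]),
}


def substitute(t, theta):
    """Apply the type substitution theta to the type expression t."""
    if isinstance(t, str):                    # a type parameter
        return theta.get(t, t)
    return (t[0],) + tuple(substitute(a, theta) for a in t[1:])


def rules(t):
    """Right-hand sides defining the topmost symbol of t."""
    if isinstance(t, str) or t[0] not in TYPEDEFS:
        return []                             # parameter or built-in type
    params, alternatives = TYPEDEFS[t[0]]
    theta = dict(zip(params, t[1:]))
    return [(f, tuple(substitute(a, theta) for a in args))
            for f, args in alternatives]


def grammar(t, limit=50):
    """Map each type reachable from t to its rules, or give up."""
    result, pending = {}, [t]
    while pending:
        current = pending.pop()
        if current in result:
            continue
        if len(result) >= limit:
            raise ValueError("more than %d types reached; "
                             "the type is probably not regular" % limit)
        result[current] = rules(current)
        for _, args in result[current]:
            pending.extend(args)
    return result
```

Here `grammar(("list", ("abc",)))` terminates with two non-terminals, whereas `grammar(("erk", ("int",)))` keeps producing ever larger types and ends with the `ValueError`.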

A ground type expression \(t\) is an element of \(\tau(\Sigma_{type})\). The grammar corresponding 
to ground type expression \(t\) defines the meaning of the type expression as a set of 
trees (\([\![grammar(t)]\!]\)). Note that during run-time every variable (for each invocation 
of a predicate) has a unique ground type in \(\tau(\Sigma_{type})\). 

Example 6 

Given the type definitions: 

:- typedef abc -> a ; b ; c. 
:- typedef list(T) -> [] ; [T | list(T)]. 

then the grammar \(r_1\) shown in Example 2 is grammar(list(abc)). The set \([\![r_1]\!]\) is 
the set of lists of a's, b's and c's. The grammar grammar(list(T)) is 

list(T) -> [] ; [T | list(T)] 



7 Note that non-regular types are rarely used (although see (Okasaki 1998)). The compiler could 
be extended to support mode checking for non-regular types as long as we keep the restriction 
to regular instantiations. 

The set \([\![grammar(list(T))]\!] = \{[]\}\). □ 

Note that the grammars corresponding to non-ground type expressions are not 
very interesting, as illustrated in the above example. We can think of a non-ground 
type expression as a mapping from grounding substitutions to (ground) types whose 
meaning is then given by their corresponding grammar. 

The built-in types float, int, char and string are conceptually expressible as 
(possibly infinite) tree grammars. For example, int can be thought of as having 
the (infinite) definition: 

:- typedef int -> 0 ; 1 ; -1 ; 2 ; -2 ; 3 ; ... 

Though the infinite number of children will render some of the algorithms on tree 
grammars ineffective this is easily avoided in the compiler by treating the type 
expressions specially (we omit details in our algorithms since it is straightforward). 

Note that in HAL, type inference and checking is performed using a constraint-based 
Hindley-Milner approach on the type expressions (Demoen et al. 1999). In 
this paper we assume that type analysis has been performed previously and there 
are no type errors. For the purposes of mode checking the type correctness of a 
program has four main consequences. First, each program variable is known to 
have a unique polymorphic type. Second, all values taken by a variable during the 
execution are known to be members of this type. Third, calls to a polymorphic 
predicate are guaranteed to have an equal or more specific type than that of the 
predicate. Fourth, all type parameters appearing in the type of a variable in the 
body of a predicate are known to also appear in the type of some variable in the head 
of the predicate. Together, these guarantee that whenever we compare grammars 
during mode checking, they correspond to exactly the same type. 8 This is used 
to substantially simplify the algorithms for mode checking (see, for example, the 
rc-definition of function H in Section 14.61 and the assumption on the existence of 
type environment 9 at the beginning of Section 17.111 . 

4.4 Values 

Types only express sets of fixed values (subsets of \(\tau(\Sigma_{tree})\)). However, during 
execution variables do not always have a fixed value and it is the role of mode checking 
to track these changes in variable instantiation. Thus, in order to perform mode 
checking we need to introduce special constants, #fresh# and #var#, to represent 
the two kinds of non-fixed values that a program variable can have during 
execution. 

The #fresh# constant is used to represent that a program variable takes no 
value (i.e., it has not been initialized), and corresponds to the new instantiation. 
Note that in HAL there is no run-time representation for #fresh# variables. As 
a result, the compiler needs to know at all times whether a variable is new or not. 

8 Even mode checking a call to a polymorphic predicate will use the calling type, which may be 
more specific than the predicate's declared type. 

Thus, any tree language including #fresh# and some other term is not a valid 
description of the values of a program variable. 

The #var# constant is used to represent a program variable (or part of a value) 
that has been initialized but not further constrained. It corresponds to a "free" 
variable in the usual logic programming sense. The #var# constructor will arise in 
descriptions of old instantiations, where we can define values which are not fixed. 
Of course it will only make sense for variables of solver types to take on this value. 

The values that a variable can take are thus represented by trees in 
\(\tau(\Sigma_{tree} \cup \{\#var\#\}) \cup \{\#fresh\#\}\). 

4.5 Instantiations 

A type expression by itself represents a set of fixed values. An instantiation by itself 
has little meaning, it is just a term in the language of expressions. Its meaning is 
only defined when it is considered in the context of a type expression. For instance, 
the meaning of ground depends upon the type of the variable it is referring to. 

In the following section we define a function RT(t, i) which takes a type expression 
t and an instantiation i and returns a tree-grammar defining the set of (possibly 
non-fixed) values that a program variable with the given type and instantiation can 
take. In this section we define the function BASE(t, i) which is the function RT(t, i) 
for the particular case in which i is a base instantiation. In order to avoid name 
clashes, the function creates a unique non-terminal grammar symbol ti(t, base) for 
the type t and base instantiation base with which it is called and returns this 
together with the grammar for t and base. The symbol ti(t, i) represents the root 
of the tree-grammar which defines the possible values of a variable of type t and 
instantiation i. 

When a program variable is new it can only have one possible value, #fresh#. 
Hence the grammar returned by BASE(t, new) for any type t is simply 

ti(t, new) -> #fresh# 

In a slight abuse of notation we will use new to refer to this grammar. 

When a program variable is ground it can take any fixed value. If the type t of 
the variable is ground, then BASE(t, ground) is identical to the grammar defining 
its type (grammar(t)). Type parameters complicate this somewhat. Since we are 
going to reason about the values of variables with non-ground types we need a way 
of representing the possible ground values of a type parameter. We introduce new 
constants of the form $ground(v)$ where \(v \in V_{type}\) to represent these languages. 
So for \(t \in V_{type}\) the grammar BASE(t, ground) is defined as 

ti(t, ground) -> $ground(t)$ 

For arbitrary types t, BASE(t, ground) is defined as the union of the rules 

ti(t', ground) -> f(ti(t_1, ground), ..., ti(t_n, ground)) 

for each rule t' -> f(t_1, ..., t_n) occurring in grammar(t), with 

ti(t', ground) -> $ground(t')$ 

for each \(t' \in V_{type}\) occurring in grammar(t). 

Conceptually, the new constant $ground(v)$ is a place holder for the grammar 
BASE(t', ground) obtained if v were replaced by the ground type t'. 

When a program variable is old it can take any initialized value. This will have 
a different effect on the parts of the type which are solver types themselves and 
on those which are not. Non-solver types do not allow the possibility of taking an 
initialized but unbound value (represented by the value #var#). Thus, for solver 
types t we shall add a production rule t -> #var# to the usual rules defining the 
type, while non-solver types remain unchanged. In order to handle type parameters 
we introduce another set of constants $old(v)$ where \(v \in V_{type}\). Each constant is 
simply a place holder for BASE(t', old) obtained if v were replaced by the ground 
type t'. Thus, BASE(t, old) for \(t \in V_{type}\) is defined as 

ti(t, old) -> $ground(t)$ ; $old(t)$ 

and otherwise BASE(t, old) is defined as the rules 

ti(t', old) -> f(ti(t_1, old), ..., ti(t_n, old)) 

for each rule t' -> f(t_1, ..., t_n) in grammar(t), together with 

ti(t', old) -> #var# 

for each solver type t' occurring in grammar(t), and 

ti(t', old) -> $ground(t')$ ; $old(t')$ 

for each type variable \(t' \in V_{type}\) occurring in grammar(t). 

The reason we represent an old variable of type t using both the $ground(t')$ 
and $old(t')$ is that then a ground variable of type t defines a sublanguage. This 
will simplify many algorithms. 

Example 7 

Given the type definitions: 

:- typedef abc -> a ; b ; c. 
:- typedef hlist(T) -> [] ; [T | hlist(T)] deriving solver. 

Then olabc1 = BASE(hlist(abc), old) is the grammar: 

ti(hlist(abc), old) -> [] ; [ti(abc, old) | ti(hlist(abc), old)] ; #var# 
ti(abc, old) -> a ; b ; c 

The set \([\![olabc1]\!]\) includes the values [], [a|#var#], [b], [b, a, c, a|#var#]. The symbol 
#var# represents an uninstantiated variable, and so the second and fourth 
values are open-ended lists. 

As another example, imagine we swap which type is a solver type. That is, suppose 
we have definitions 

:- typedef habc -> a ; b ; c deriving solver. 
:- typedef list(T) -> [] ; [T | list(T)]. 

Then olabc2 = BASE(list(habc), old) is the grammar: 

ti(list(habc), old) -> [] ; [ti(habc, old) | ti(list(habc), old)] 
ti(habc, old) -> a ; b ; c ; #var# 

The set \([\![olabc2]\!]\) includes the values [], [a], [#var#, b, #var#] which are all fixed-length 
lists whose elements may be variables. Note that the two occurrences of the 
symbol #var# in the last tree do not necessarily represent the same solver variable. 
Finally BASE(hlist(T), old) is (using the first definition) 

ti(hlist(T), old) -> [] ; [ti(T, old) | ti(hlist(T), old)] ; #var# 
ti(T, old) -> $ground(T)$ ; $old(T)$ 

□ 
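
The three cases of BASE can be put together in a few lines. The Python sketch below is an illustration under assumptions of our own: type expressions are nested tuples such as `("hlist", ("abc",))`, type parameters are bare strings, each table entry carries a flag recording whether the type derives solver, and a grammar is returned as a dictionary from non-terminals `("ti", type, inst)` to sets of right-hand sides.

```python
# name -> (parameters, [(functor, argument types), ...], is_solver_type)
TYPEDEFS = {
    "abc": ((), [("a", ()), ("b", ()), ("c", ())], False),
    "hlist": (("T",), [("[]", ()), (".", ("T", ("hlist", "T")))], True),
}


def substitute(t, theta):
    if isinstance(t, str):                     # a type parameter
        return theta.get(t, t)
    return (t[0],) + tuple(substitute(a, theta) for a in t[1:])


def rules(t):
    if isinstance(t, str) or t[0] not in TYPEDEFS:
        return []
    params, alternatives, _ = TYPEDEFS[t[0]]
    theta = dict(zip(params, t[1:]))
    return [(f, tuple(substitute(a, theta) for a in args))
            for f, args in alternatives]


def grammar(t):
    """All types reachable from t, each mapped to its rules."""
    result, pending = {}, [t]
    while pending:
        current = pending.pop()
        if current not in result:
            result[current] = rules(current)
            for _, args in result[current]:
                pending.extend(args)
    return result


def is_solver(t):
    return not isinstance(t, str) and TYPEDEFS.get(t[0], ((), [], False))[2]


def base(t, inst):
    """Sketch of BASE(t, inst) for inst in {"new", "ground", "old"}."""
    if inst == "new":
        return {("ti", t, "new"): {("#fresh#",)}}
    out = {}
    for t1, alternatives in grammar(t).items():
        if isinstance(t1, str):                # type parameter: place holders
            rhs = {("$ground(%s)$" % t1,)}
            if inst == "old":
                rhs.add(("$old(%s)$" % t1,))
        else:
            rhs = {(f,) + tuple(("ti", a, inst) for a in args)
                   for f, args in alternatives}
            if inst == "old" and is_solver(t1):
                rhs.add(("#var#",))            # initialized but unbound
        out[("ti", t1, inst)] = rhs
    return out
```

With these definitions `base(("hlist", ("abc",)), "old")` corresponds to the grammar olabc1 of Example 7, and `base(("hlist", "T"), "old")` contains the two place-holder constants for T.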

Let us now consider instantiations in general, rather than only base instantiations. 
Instantiation expressions (or instantiations) are terms in the language 
\(\tau(\Sigma_{inst}, V_{inst})\) where \(\Sigma_{inst}\) are instantiation constructors and variables \(V_{inst}\) are 
instantiation parameters. Each instantiation constructor \(g/n \in \Sigma_{inst}\) must have a 
definition. Often, we will overload functors as both type and instantiation constructors 
(so \(\Sigma_{type}\) and \(\Sigma_{inst}\) are not disjoint). The base instantiations (ground, old and 
new) are simply special 0-ary elements of \(\Sigma_{inst}\). 

Definition 8 

An instantiation definition for \(g\) is of the form: 

\[ \texttt{:- instdef } g(w_1, \ldots, w_n) \texttt{ -> } (g_1(i^1_1, \ldots, i^1_{m_1}) ; \cdots ; g_k(i^k_1, \ldots, i^k_{m_k})). \]

where \(w_1, \ldots, w_n\) are distinct instantiation parameters, \(\{g_1/m_1, \ldots, g_k/m_k\} \subseteq \Sigma_{tree}\) 
are distinct tree constructors, and \(i^1_1, \ldots, i^k_{m_k}\) are instantiation expressions other 
than new (see footnote 9) involving at most the parameters \(w_1, \ldots, w_n\). Just as for type 
definitions, we demand that instantiation definitions are regular (see footnote 10). □ 

We can associate a set of production rules \(rules(i)\) with an instantiation expression 
\(i\) just as we do for type expressions. For the base instantiations we define 
\(rules(new) = rules(old) = rules(ground) = \emptyset\). 

A ground instantiation is an element of \(\tau(\Sigma_{inst})\). The existence of instantiation 
parameters during mode analysis would significantly complicate the task of 
the analyzer. This is mainly because functions to compare type-instantiations or 
to compute their join and meet would need to return a set of constraints involving 
instantiation parameters. Furthermore, predicate mode declarations containing 
instantiation parameters might need to express some constraints involving those 
instantiations. Therefore, for simplicity, HAL (like Mercury; see footnote 11) requires 
instantiations appearing in a predicate mode declaration to be ground. As a result, mode 
checking only deals with ground instantiations and, from now on, we will assume 
all instantiations are ground. 

The reason this problem does not arise with type parameters is that, as mentioned 
before, type correctness guarantees that whenever we compare type-instantiations, 
the two types being compared are syntactically identical. Thus, if two type param- 
eters are being compared, they are guaranteed to be the same type parameter. 

9 As mentioned before, disallowing nesting of the new instantiation simplifies mode analysis. It 
also ensures that all subparts of a data structure have a proper representation at run-time. 

10 It is hard to see how to lift this restriction. 

11 Recently Mercury has added an (as yet unreleased) feature allowing limited non-ground 
instantiations in predicate modes. 

RT(t, i) 
    (r, _) := RT(t, i, ∅) 
    return r 

RT(t, i, P) 
    if (ti(t, i) ∈ P) return (∅, ti(t, i)) 
    if (i is a base instantiation) return (BASE(t, i), ti(t, i)) 
    if (t ∈ V_type) return (⊤, _) 
    r := ∅ 
    foreach rule x_i -> f(x_i1, ..., x_in) in rules(i) 
        if exists rule x_t -> f(x_t1, ..., x_tn) in rules(t) 
            for j = 1..n 
                (r_j, x_j) := RT(x_tj, x_ij, P ∪ {ti(t, i)}) 
                if (r_j = ⊤) return (⊤, _) 
            endfor 
            r := {ti(t, i) -> f(x_1, ..., x_n)} ∪ r ∪ r_1 ∪ ... ∪ r_n 
        endif 
    endfor 
    return (r, ti(t, i)) 

Fig. 2. Algorithm for computing the type-instantiation grammar RT(t, i) 



4.6 Type-Instantiation Grammars 

In this section we define the function RT(t, i) which takes a type expression t and 
a ground instantiation expression i and returns a type-instantiation tree grammar 
(or ti-grammar). Mode checking will manipulate ti-grammars, built from the types 
and instantiations occurring in the program. 

The function RT defines the meaning of combining a type with an instantiation by 
extending BASE to non-base instantiations. A non-base instantiation combines with 
a type in a manner analogous to the \(\sqcap\) operation over the rules defining each other. 
Intuitively the function RT intersects the grammars of t and i. This is not really 
the case because of special treatment of type parameters and base instantiations. 

Figure 2 gives the algorithm for computing RT(t, i). The function RT(t, i, P) does 
all of the work. It creates a unique grammar symbol ti(t, i) for the type t and 
instantiation i with which it is called and returns this with the type-instantiation 
grammar for t and i. Its last argument P is the set of grammar symbols constructed 
in the parent calls: this is used to check that the symbol ti(t, i) has not already been 
encountered and so avoids infinite recursion. The root of the grammar r returned 
is the symbol ti(t, i). 

Note that it is a mode error to associate a non-base instantiation with a parameter 
type \(t \in V_{type}\), since we cannot know what function symbols make up the type t. 
In this case the algorithm returns the special \(\top\) grammar indicating a mode error. 

Example 9 

Consider the types list/1 and habc of Example 7 and instantiation nelist/1 from 
the program in Figure 1. Then ti-grammar RT(list(habc), nelist(old)) is 

ti(list(habc), nelist(old)) -> [ti(habc, old) | ti(list(habc), list(old))] 
ti(list(habc), list(old)) -> [] ; [ti(habc, old) | ti(list(habc), list(old))] 
ti(habc, old) -> a ; b ; c ; #var# 

while RT(list(T), nelist(ground)) is 

ti(list(T), nelist(ground)) -> [ti(T, ground) | ti(list(T), list(ground))] 
ti(list(T), list(ground)) -> [] ; [ti(T, ground) | ti(list(T), list(ground))] 
ti(T, ground) -> $ground(T)$ 
□ 
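
To make the recursion of RT concrete, the following Python sketch mirrors its structure on the types of Example 7. It is a sketch under stated assumptions rather than the compiler's implementation: types and instantiations are nested tuples, the definitions given for the instantiations list/1 and nelist/1 are the usual ones and are assumed here, and the rules are accumulated in a shared dictionary `out` instead of being returned as unions of rule sets.

```python
# Bare strings are type parameters ("T") or base instantiations ("old").
TYPEDEFS = {  # name -> (parameters, alternatives, is_solver_type)
    "habc": ((), [("a", ()), ("b", ()), ("c", ())], True),
    "list": (("T",), [("[]", ()), (".", ("T", ("list", "T")))], False),
}
INSTDEFS = {  # assumed definitions of list/1 and nelist/1
    "list": (("I",), [("[]", ()), (".", ("I", ("list", "I")))]),
    "nelist": (("I",), [(".", ("I", ("list", "I")))]),
}
BASE_INSTS = ("new", "old", "ground")
TOP = "TOP"   # stands for the error grammar


def subst(e, theta):
    if isinstance(e, str):
        return theta.get(e, e)
    return (e[0],) + tuple(subst(a, theta) for a in e[1:])


def rules(e, defs):
    """Rules for the top symbol of e as {(functor, arity): arguments}."""
    if isinstance(e, str) or e[0] not in defs:
        return {}
    params, alternatives = defs[e[0]][0], defs[e[0]][1]
    theta = dict(zip(params, e[1:]))
    return {(f, len(a)): tuple(subst(x, theta) for x in a)
            for f, a in alternatives}


def base(t, inst, out):
    """Add the rules of BASE(t, inst) to out and return its root symbol."""
    nt = ("ti", t, inst)
    if nt in out:
        return nt
    out[nt] = set()
    if inst == "new":
        out[nt].add(("#fresh#",))
    elif isinstance(t, str):                   # type parameter
        out[nt].add(("$ground(%s)$" % t,))
        if inst == "old":
            out[nt].add(("$old(%s)$" % t,))
    else:
        for (f, _), args in rules(t, TYPEDEFS).items():
            out[nt].add((f,) + tuple(base(a, inst, out) for a in args))
        if inst == "old" and TYPEDEFS.get(t[0], ((), [], False))[2]:
            out[nt].add(("#var#",))
    return nt


def rt(t, i, out, parents=frozenset()):
    nt = ("ti", t, i)
    if nt in parents:                          # already under construction
        return nt
    if i in BASE_INSTS:
        return base(t, i, out)
    if isinstance(t, str):                     # non-base inst on a parameter
        return TOP
    type_rules, rhs = rules(t, TYPEDEFS), set()
    for key, inst_args in rules(i, INSTDEFS).items():
        if key in type_rules:
            kids = [rt(tj, ij, out, parents | {nt})
                    for tj, ij in zip(type_rules[key], inst_args)]
            if TOP in kids:
                return TOP
            rhs.add((key[0],) + tuple(kids))
    out[nt] = rhs
    return nt


def RT(t, i):
    out = {}
    root = rt(t, i, out)
    return TOP if root == TOP else (root, out)
```

Calling `RT(("list", ("habc",)), ("nelist", "old"))` produces the three rules shown first in Example 9, and `RT("T", ("nelist", "ground"))` returns the error marker because a non-base instantiation is applied to a type parameter.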

A ti-grammar is thus a regular tree grammar defined over the signature 

\[ \Sigma_{tree} \cup \{ \$old(v)\$, \$ground(v)\$ \mid v \in V_{type} \} \cup \{ \#var\#, \#fresh\# \} \]

and non-terminal set 

\[ \tau(\Sigma_{type} \cup \Sigma_{inst} \cup \{ti/2\}, V_{type}) \cup \{new\} \]

Note that by construction the partial ordering and meet and join on tree grammars 
extend to ti-grammars including type parameters. As mentioned before, type 
correctness guarantees that during mode checking we will only compare ti-grammars 
for the same type parameter \(v \in V_{type}\). For this reason, we only need note that 
\(RT(v, ground) \preceq RT(v, old)\) for a parameter \(v \in V_{type}\), which follows from the construction 
since \([\![RT(v, ground)]\!] = \{\$ground(v)\$\}\) and \([\![RT(v, old)]\!] = \{\$ground(v)\$, \$old(v)\$\}\), 
and the meet and join operations follow in the natural way. 

The operations that we perform on ti-grammars during mode checking will be \(\preceq\), 
abstract conjunction and abstract disjunction. Abstract conjunction differs slightly 
from \(\sqcap\) since we will be changing variables with a new ti-grammar to ti-grammars 
for bound values (whenever the variable becomes instantiated). The abstract conjunction 
operation \(\wedge\) is defined as: 

\[ r_1 \wedge r_2 = \begin{cases} r_1, & \text{where } r_2 = new \\ r_2, & \text{where } r_1 = new \\ r_1 \sqcap r_2, & \text{otherwise} \end{cases} \]

Abstract disjunction is again slightly different from the \(\sqcup\) operation. Since the 
compiler needs to know whether the value of a variable is new or not, we need to 
ensure the abstract disjunction operation does not create ti-grammars (other than 
\(\top\)) in which this information is lost, i.e., grammars that include #fresh# as well 
as other terms. The abstract disjunction operation \(\vee\) is defined as: 

\[ r_1 \vee r_2 = \begin{cases} r_1 \sqcup r_2, & \text{where } r_1 \neq new \text{ and } r_2 \neq new \\ new, & \text{where } r_1 = new \text{ and } r_2 = new \\ \top, & \text{otherwise} \end{cases} \]

Finally, we introduce the concept of a type-instantiation state (or ti-state)
$\{x_1 \mapsto r_1, \ldots, x_n \mapsto r_n\}$, which maps program variables to ti-grammars. Ti-grammars are
used during mode checking to store the possible values of the program variables at
each program point. We can extend operations on ti-grammars to ti-states over the
same set of variables in the obvious pointwise manner. Given ti-states
$\mathit{TI} = \{x_1 \mapsto r_1, \ldots, x_n \mapsto r_n\}$ and $\mathit{TI}' = \{x_1 \mapsto r'_1, \ldots, x_n \mapsto r'_n\}$ then:

• $\mathit{TI} \preceq \mathit{TI}'$ iff $r_l \preceq r'_l$ for all $1 \le l \le n$,

• $\mathit{TI} \wedge \mathit{TI}' = \{x_l \mapsto r_l \wedge r'_l \mid 1 \le l \le n\}$ and

• $\mathit{TI} \vee \mathit{TI}' = \{x_l \mapsto r_l \vee r'_l \mid 1 \le l \le n\}$.
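The two abstract operations and their pointwise extension to ti-states can be illustrated with a small Python sketch. It is not the paper's implementation: as a simplifying assumption, a ti-grammar is modelled as either the marker NEW, the marker TOP, or a frozenset of the terms it denotes (so the empty set stands in for $\bot$), and all function names are hypothetical.

```python
# Simplified stand-in for ti-grammars (an assumption of this sketch):
# NEW and TOP are special markers; any other ti-grammar is modelled as the
# frozenset of terms it denotes, so frozenset() plays the role of bottom.
NEW, TOP = "new", "top"

def meet(r1, r2):
    """Greatest lower bound of two non-new ti-grammars."""
    if r1 == TOP:
        return r2
    if r2 == TOP:
        return r1
    return r1 & r2

def join(r1, r2):
    """Least upper bound of two non-new ti-grammars."""
    if TOP in (r1, r2):
        return TOP
    return r1 | r2

def conj(r1, r2):
    """Abstract conjunction: a new variable takes the other grammar."""
    if r2 == NEW:
        return r1
    if r1 == NEW:
        return r2
    return meet(r1, r2)

def disj(r1, r2):
    """Abstract disjunction: never mixes new with bound values."""
    if r1 != NEW and r2 != NEW:
        return join(r1, r2)
    if r1 == NEW and r2 == NEW:
        return NEW
    return TOP

def state_conj(ti1, ti2):
    """Pointwise abstract conjunction of two ti-states over the same variables."""
    return {x: conj(ti1[x], ti2[x]) for x in ti1}

def state_disj(ti1, ti2):
    """Pointwise abstract disjunction of two ti-states over the same variables."""
    return {x: disj(ti1[x], ti2[x]) for x in ti1}
```

In this model a variable that is new in one ti-state and bound in the other is mapped to TOP by `state_disj`, which is the situation the definition of $\vee$ is designed to expose.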

5 Basic Mode Checking 

Mode checking is a complex process which aims to reorder body literals to satisfy the 
mode constraints provided by each mode declaration. The aim of this is to be able 
to generate specialized code for each mode declaration. The code corresponding 
to each mode declaration is referred to as a procedure, and calls to the original 
predicate are replaced by calls to the appropriate procedure. Recall that before 
mode checking is applied the HAL compiler performs type checking (and inference) 
so that each program variable has a type, and the program is guaranteed to be type 
correct. 

5.1 Well-moded programs 

We now define what it means for a HAL program to be well-moded. 

The execution of a HAL program is performed on procedures which are predicates 
re-ordered for a particular mode. At run-time each type parameter has an associated 
ground type. For our purposes we assume a given type environment $\theta$ (a ground type
substitution) describes the run-time types associated with each type parameter.

A call to a procedure $p/n$ in mode $p(c_1 \to s_1, \ldots, c_n \to s_n)$ is a type environment
$\theta$ and a value $d_i$ for each argument $1 \le i \le n$. It follows from type correctness of
the program that $d_i \in [\![\mathrm{RT}(\theta(t_i), \mathtt{old})]\!] \cup \{\mathtt{\#fresh\#}\}$ for each argument $1 \le i \le n$.

A program is input-output mode-correct if any call to a predicate which is correct
with respect to the input instantiation for some mode declared for that predicate
will only have answers that are correct with respect to the output instantiation
of that mode. More formally, a program is input-output mode-correct if for each
procedure $p/n$ with declared type $p(t_1, \ldots, t_n)$ in mode $p(c_1 \to s_1, \ldots, c_n \to s_n)$,
and for any call of the form $p(d_1, \ldots, d_n)$ with type environment $\theta$ such that
$d_i \in [\![\mathrm{RT}(\theta(t_i), c_i)]\!]$, $1 \le i \le n$, it is the case that the resulting values
$d'_1, \ldots, d'_n$ on success of the procedure are such that $d'_i \in [\![\mathrm{RT}(\theta(t_i), s_i)]\!]$,
$1 \le i \le n$. In other words, the declared mode is satisfied by the code generated
for the procedure.

Example 10 

For example the first mode for predicate pop/3, defined in an earlier example,
:- mode pop(in, out, out) is semidet.

will be shown to be input-output mode-correct by showing that if the first argument 
to pop/3 is ground at call time, and the last two arguments new, then all three 
arguments will be ground on success of the predicate. □ 

A program is call mode-correct if any call to a predicate which is correct with
respect to the input instantiation for some mode declared for that predicate will
only lead to calls to literals within the definition of the predicate which are
mode-correct. More formally, a program is call mode-correct if for each procedure $p/n$ with
declared type $p(t_1, \ldots, t_n)$ in mode $p(c_1 \to s_1, \ldots, c_n \to s_n)$, and for any call of the
form $p(d_1, \ldots, d_n)$ with type environment $\theta$ such that $d_i \in [\![\mathrm{RT}(\theta(t_i), c_i)]\!]$, $1 \le i \le n$,
it is the case that each call to a procedure $p'/n'$ with type (given by the occurrence
in the definition of $p/n$) $p'(t'_1, \ldots, t'_{n'})$ in mode $p'(c'_1 \to s'_1, \ldots, c'_{n'} \to s'_{n'})$ of the
form $p'(d'_1, \ldots, d'_{n'})$ is such that $d'_i \in [\![\mathrm{RT}(\theta(t'_i), c'_i)]\!]$, $1 \le i \le n'$, and any call to an
equality of the form $x_1 = x_2$ is either a copy or unify and any call to an equality
of the form $x = f(x_1, \ldots, x_n)$ is either a construct or deconstruct. In other words
each mode-correct call leads to only mode-correct calls.

Example 11 

Consider the following code which duplicates the top element of the stack: 

:- pred dupl(list(T), list(T)).    % duplicate top of stack
:- mode dupl(in(nelist(ground)), out(nelist(ground))) is det.
dupl(S0, S) :- S0 = [], S = [].
dupl(S0, S) :- push(S0, A, S), pop(S0, A, S1).

Showing call mode-correctness for the procedure for dupl/2 involves showing that 
any correct call to dupl/2 (that is with the first argument a non-empty ground 
list, and its second argument new) will call push/3 and pop/3 with correct input 
instantiations for one of their given modes, and each equation must be either a 
construct or deconstruct. □ 

A program is well-moded if it is input-output mode-correct and call mode-correct. 

We shall now explain mode checking by showing how to check whether each 
program construct is schedulable for a given ti-state TI and, if so, what the resulting 
ti-state TI' is. The scheduling also returns a goal illustrating the order of execution 
of conjunctions, and the mode for each equation or predicate call. If the program 
construct is not schedulable for the given ti-state it may be reconsidered after other 
constructs have been scheduled. We assume that before checking each construct 
for an initial ti-state TI, we extend TI so that any variable of type t local to the 
construct is assigned the ti-grammar new. 

5.2 Equality 

Consider the equality $x_1 = x_2$ where $x_1$ and $x_2$ are variables of type $t$ and the
current ti-state is $\mathit{TI} = \{x_1 \mapsto r_1, x_2 \mapsto r_2\} \cup \mathit{RTI}$ (where $\mathit{RTI}$ is the ti-state for
the remaining variables). The two standard modes of usage for such an equality are
copy (:=) and unify (==). If exactly one of $r_1$ and $r_2$ is new (say $r_1$), the copy
x1 := x2 can be performed and the resulting ti-state is
$\mathit{TI}' = \{x_1 \mapsto r_2, x_2 \mapsto r_2\} \cup \mathit{RTI}$. If both are not new then unify x1 == x2 is
performed and the resulting instantiation is
$\mathit{TI}' = \{x_1 \mapsto r_1 \wedge r_2, x_2 \mapsto r_1 \wedge r_2\} \cup \mathit{RTI}$. If neither of the two
modes of usage apply (i.e. both variables are new), the literal is not schedulable
(although it might become schedulable after automatic initialization, see Section 6).
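The case analysis for variable-variable equalities can be sketched as follows. This is a hedged illustration, not the HAL compiler's code: ti-grammars are assumed to be either the marker NEW or a frozenset of denoted terms, and `schedule_var_eq` is a hypothetical name.

```python
NEW = "new"  # marker for the ti-grammar of a new (not yet bound) variable

def schedule_var_eq(ti, x1, x2):
    """Schedule x1 = x2 in ti-state ti.

    Returns (code, new_ti), or None when both variables are new and the
    literal must be delayed (or made schedulable by initialization).
    """
    r1, r2 = ti[x1], ti[x2]
    if r1 == NEW and r2 == NEW:
        return None
    out = dict(ti)
    if r1 == NEW:                      # copy into x1
        out[x1] = r2
        return (f"{x1} := {x2}", out)
    if r2 == NEW:                      # copy into x2
        out[x2] = r1
        return (f"{x2} := {x1}", out)
    out[x1] = out[x2] = r1 & r2        # unify: both get the conjunction
    return (f"{x1} == {x2}", out)
```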

Consider the equality $x = f(x_1, \ldots, x_n)$ where $x, x_1, \ldots, x_n$ are variables with
types $\{x \mapsto t, x_1 \mapsto t_1, \ldots, x_n \mapsto t_n\}$ and current ti-state
$\mathit{TI} = \{x \mapsto r, x_1 \mapsto r_1, \ldots, x_n \mapsto r_n\} \cup \mathit{RTI}$. The two standard modes of usage of such
an equality are construct (:=) and deconstruct (=:). The construct mode applies if
$r$ is new and none of the $r_j$ are new. The resulting ti-state is
$\mathit{TI}' = \{x \mapsto r', x_1 \mapsto r_1, \ldots, x_n \mapsto r_n\} \cup \mathit{RTI}$ where $r'$ is the ti-grammar defined by
$\{a \to f(root(r_1), \ldots, root(r_n))\} \cup r_1 \cup \cdots \cup r_n$, where $a$ is a new non-terminal
(i.e. the grammar defining the terms constructible from an $f$ with arguments from
$r_1, \ldots, r_n$ respectively). The deconstruct mode applies if each $r_j$ is new and $r$ is
not new and has no production rule $root(r) \to \mathtt{\#var\#}$ (which means it is definitely
bound to some functor). The resulting ti-state is
$\mathit{TI}' = \{x \mapsto r, x_1 \mapsto r'_1, \ldots, x_n \mapsto r'_n\} \cup \mathit{RTI}$ where $r'_1, \ldots, r'_n$ are defined below.
If $r$ has a production rule of the form $root(r) \to f(y_1, \ldots, y_n)$, then
$r'_j = subg(y_j, r)$, $1 \le j \le n$. If $r$ has no rule of this form, then the
resulting ti-state is the same but with $r'_j = \bot$, $1 \le j \le n$, indicating that the
deconstruct must fail. If some of the variables $x_j$ are new and some are not
(say $x_{k_1}, \ldots, x_{k_m}$) the mode checking process decomposes the equality constraint
into a deconstruct followed by new equalities by introducing fresh variables, e.g.
$x = f(x_1, \ldots, fresh_{k_j}, \ldots), \ldots, x_{k_j} = fresh_{k_j}, \ldots$. These new equalities are handled
as above.

Note that if $r = \mathtt{new}$ and some $r_j = \mathtt{new}$ then the literal is not schedulable
(although it might become schedulable after automatic initialization, again see Section 6).

Example 12 

Assume X and Y are ground lists, while A is new. Scheduling the goal Y = [A|X]
results in the code Y =: [A|F], X == F. □
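The construct and deconstruct cases, including the decomposition used in Example 12, can be sketched in Python. The sketch rests on assumptions that are not in the paper: a ti-grammar is NEW or a frozenset of terms, a term is a tuple `(functor, arg1, ..., argn)`, the string "#var#" in a set marks a possibly unbound value, and fresh variables are named `Fresh<i>`. Only the strict (Mercury-style) deconstruct is modelled.

```python
from itertools import product

NEW = "new"     # ti-grammar of a new variable
VAR = "#var#"   # member marking a possibly unbound (old) value

def schedule_functor_eq(ti, x, f, args):
    """Schedule x = f(args...); return (goals, new_ti) or None."""
    r = ti[x]
    arg_rs = [ti[a] for a in args]
    out = dict(ti)
    if r == NEW:
        if any(a == NEW for a in arg_rs):
            return None                  # delay, or initialize first
        # Construct: every combination of argument terms under functor f.
        out[x] = frozenset((f, *ts) for ts in product(*arg_rs))
        return ([f"{x} := {f}({', '.join(args)})"], out)
    if VAR in r:
        return None                      # x might be unbound: not handled here
    matching = [t for t in r
                if isinstance(t, tuple) and t[0] == f and len(t) == len(args) + 1]
    goals, pattern = [], []
    for i, a in enumerate(args):
        # Project argument i; an empty set plays the role of bottom.
        sub = frozenset(t[i + 1] for t in matching)
        if ti[a] == NEW:
            out[a] = sub
            pattern.append(a)
        else:
            # Bound argument: deconstruct into a fresh variable, then unify.
            fresh = f"Fresh{i}"
            out[fresh] = out[a] = sub & ti[a]
            pattern.append(fresh)
            goals.append(f"{a} == {fresh}")
    return ([f"{x} =: {f}({', '.join(pattern)})"] + goals, out)
```

Projecting each argument position independently loses the correlation between arguments, which mirrors the non-relational nature of taking one sub-grammar per argument.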

The above uses of deconstruct are guaranteed to be safe at run-time and corre- 
spond to the modes of usage allowed by Mercury. HAL, in addition to the above, 
allows the use of the deconstruct mode when x is old (i.e. r contains a production
rule $root(r) \to \mathtt{\#var\#}$). In this case we check whether r has a production rule of
the form $root(r) \to f(y_1, \ldots, y_n)$ and we proceed as in the previous paragraph.
Note that this is (the only place) where the HAL mode system is not completely 
strong (i.e. run-time mode errors can occur). The following example illustrates the 
need for this behavior. 

Example 13 

Consider the types abc/0 and hlist/1 from Example 7; the following use of
append/3 may not detect a mode error until run-time:

:- pred append(hlist(abc), hlist(abc), hlist(abc)).
:- mode append(oo, oo, no) is nondet.
append(X, Y, Z) :- X = [], Y = Z.
append(X, Y, Z) :- X = [A|X1], append(X1, Y, Z1), Z = [A|Z1].

The equation X = [A|X1] is schedulable as a deconstruct since X is old. However, if
at run-time X is not bound when append/3 is called, the deconstruct will generate a
run-time error since A is not a solver variable and, thus, it cannot be initialized. Note
that if we did not allow deconstruction on old variables then the above predicate
would not pass mode checking thus preventing mode-correct goals like

?- X = [a,b,c], init(Y), append(X, Y, Z).

from being compiled. □

If we never allow Herbrand solver types to contain non-solver types (as in the 
example above), the problem cannot occur. This gap in mode checking seems un- 
avoidable if we are to allow Herbrand solver types to contain non-solver types. 
However, it seems that in practice this gap is not problematic: in most programs, 
the possibility of a run-time mode error does not exist. Whenever it does, the com- 
piler emits a warning message. In fact, we have never detected a run-time mode 
error. 

5.3 Predicates 

In this subsection we describe the scheduling of predicate calls so that the resulting 
program after scheduling is call mode-correct. 

Consider the predicate call $p(x_1, \ldots, x_n)$ where each $x_i$ is a variable with type $t_i$.
Assume $p$ has the mode declaration $p(c_1 \to s_1, \ldots, c_n \to s_n)$ where $c_j$, $s_j$ are the
call and success instantiations, respectively, for argument $j$, and the current ti-state
is $\mathit{TI} = \{x_1 \mapsto r_1, \ldots, x_n \mapsto r_n\} \cup \mathit{RTI}$.

Note that the handling of polymorphic application is hidden here, since the type
$t_i$ of the variable $x_i$ is its type in the calling literal $p(x_1, \ldots, x_n)$, which may be more
specific than the declared/inferred type of argument $i$ of $p$. Because instantiations
are separate from types this is straightforwardly expressed by constructing the
ti-grammar for the mode specific calling type $t_i$ and the appropriate instantiations.

The predicate call can be scheduled if for each $1 \le j \le n$ the current ti-state
restricts the $j$-th argument more than (defines a subset of) the calling ti-state required
for $p$, i.e. $r_j \preceq \mathrm{RT}(t_j, c_j)$. If the predicate call is schedulable for this mode the new
ti-state is $\mathit{TI}' = \{x_1 \mapsto r_1 \wedge \mathrm{RT}(t_1, s_1), \ldots, x_n \mapsto r_n \wedge \mathrm{RT}(t_n, s_n)\} \cup \mathit{RTI}$. The
predicate call can also be scheduled if for each $j$ such that $r_j \not\preceq \mathrm{RT}(t_j, c_j)$ then
$\mathrm{RT}(t_j, c_j) = \mathtt{new}$. For each such $j$, the argument $x_j$ in predicate call
$p(x_1, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n)$ is replaced by $fresh_j$, where $fresh_j$ is a fresh new
program variable, and the equation $fresh_j = x_j$ is added after the predicate call. Such
"extra" modes are usually referred to as implied modes.
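The scheduling test for a single mode, including implied modes, can be sketched as below. This is an illustration under assumptions, not the compiler's code: ti-grammars are NEW or frozensets of denoted terms, a mode is a list of (call, success) ti-grammar pairs that are already specialized to the argument types, and the names `schedule_call` and `Fresh<j>` are hypothetical.

```python
NEW = "new"

def leq(r1, r2):
    """Ordering on the simplified ti-grammars: new is only below new."""
    if r1 == NEW or r2 == NEW:
        return r1 == r2
    return r1 <= r2

def conj(r1, r2):
    """Abstract conjunction on the simplified ti-grammars."""
    if r2 == NEW:
        return r1
    if r1 == NEW:
        return r2
    return r1 & r2

def schedule_call(ti, pred, args, mode):
    """Try to schedule pred(args) for one mode; return (goals, new_ti) or None."""
    out = dict(ti)
    actual, extra = [], []
    for j, (x, (c, s)) in enumerate(zip(args, mode)):
        r = ti[x]
        if leq(r, c):                    # argument satisfies the call instantiation
            actual.append(x)
            out[x] = conj(r, s)
        elif c == NEW:                   # implied mode: pass a fresh variable
            fresh = f"Fresh{j}"
            out[fresh] = s
            actual.append(fresh)
            extra.append(f"{fresh} = {x}")
        else:
            return None                  # this mode is not schedulable
    return ([f"{pred}({', '.join(actual)})"] + extra, out)
```

The added equations in `extra` are ordinary literals and would be scheduled afterwards as unifications.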

Example 14 

Consider the goal empty(S0) for the stack program given in the earlier figure, where the
type of S0 is given by $\{S0 \mapsto list(abc)\}$ (which is more specific than the declared type
list(T)) and the current ti-state is $\mathit{TI} = \{S0 \mapsto \mathtt{new}\}$. The two modes for empty
(in expanded form) are

:- mode empty(ground -> ground) is semidet.
:- mode empty(new -> ground) is det.

The first mode of empty cannot be scheduled since $\mathtt{new} \not\preceq \mathrm{RT}(list(abc), \mathtt{ground})$,
but the second mode can be scheduled, since $\mathtt{new} \preceq \mathrm{RT}(list(abc), \mathtt{new}) = \mathtt{new}$. □

If more than one mode of the same predicate is schedulable, in theory the compiler 
should try each possibility. Unfortunately, this search may be too expensive. For this 
reason, HAL (like Mercury) chooses one schedulable mode and commits to it. This
behavior might lead to the compiler failing to check a mode-correct procedure (see
Example 27). In order to minimize this risk, we choose a schedulable mode whose
success ti-state $\mathit{TI}$ defined as $\{x_1 \mapsto \mathrm{RT}(t_1, s_1), \ldots, x_n \mapsto \mathrm{RT}(t_n, s_n)\}$ is minimal;
that is, for each other schedulable mode with success ti-state $\mathit{TI}'$ it is the case that
$\mathit{TI}' \not\prec \mathit{TI}$. Note that there may be more than one mode with a minimal success
ti-state. In the case that we have more than one mode with the same minimal success
state then we use a mode with a minimal call ti-state.

Example 15 

Consider the scheduling of the goal pop(A,B,C) where current ti-state is
$\mathit{TI} = \{A \mapsto r_1, B \mapsto \mathtt{new}, C \mapsto r_3\}$ where $r_1$ and $r_3$ are defined by the grammars

r1 -> [r2 | r3]
r2 -> b
r3 -> []

That is A = [b] and C = []. Neither of the declared modes for pop, shown below,
are immediately applicable.

:- mode pop(in, out, out) is semidet.
:- mode pop(in(nelist(ground)), out, out) is det.

But both modes fit the conditions for an implied mode. Since the second mode has
a more specific success ti-state (the first argument is known to be non-empty) it
is chosen. The resulting code is pop_mode2(A,B,Fresh), Fresh = C, where mode
checking will then schedule the new equation appropriately. □

The idea is to maintain as much instantiation information as possible, thus re- 
stricting as little as possible the number of schedulable modes for the remaining 
literals. In our experience with compiling real programs this policy seems adequate 
to avoid any problems. It is straightforward, but in practice too expensive, to im- 
plement a complete search for all possible schedules. 
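The commit-to-one-mode policy described in this subsection can be sketched as a selection function over the schedulable modes. The sketch is illustrative only: ti-grammars are assumed to be NEW or frozensets of denoted terms, and `Mode`, `strictly_below`, and `choose_mode` are hypothetical names.

```python
from typing import NamedTuple

NEW = "new"

class Mode(NamedTuple):
    name: str
    call: tuple      # call ti-grammar for each argument
    success: tuple   # success ti-grammar for each argument

def leq(r1, r2):
    """Ordering on the simplified ti-grammars: new is only below new."""
    if r1 == NEW or r2 == NEW:
        return r1 == r2
    return r1 <= r2

def strictly_below(s1, s2):
    """Pointwise ordering on ti-states, excluding equality."""
    return s1 != s2 and all(leq(a, b) for a, b in zip(s1, s2))

def choose_mode(schedulable):
    """Pick a mode with a minimal success state, then a minimal call state."""
    minimal = [m for m in schedulable
               if not any(strictly_below(o.success, m.success) for o in schedulable)]
    first = minimal[0]
    ties = [m for m in minimal if m.success == first.success]
    best = [m for m in ties
            if not any(strictly_below(o.call, m.call) for o in ties)]
    return best[0]
```

When several incomparable minimal success states exist, this sketch simply takes the first one in declaration order; the text does not say how the compiler breaks such ties, so that choice is an assumption.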

5.4 Conjunctions, Disjunctions and If-Then-Elses

To determine if a conjunction $G_1, \ldots, G_n$ is schedulable for initial ti-state $\mathit{TI}$ we
choose the left-most goal $G_j$ which is schedulable for $\mathit{TI}$ and compute the new
ti-state $\mathit{TI}_j$. This default behavior schedules goals as close to the programmer given
left-to-right order as possible. If the state $\mathit{TI}_j$ assigns $\bot$ to any variable, then the
subgoal $G_j$ must fail and hence the whole conjunction is schedulable. The resulting
ti-state $\mathit{TI}'$ maps all variables to $\bot$, and the final conjunction contains all previously
scheduled goals followed by fail. If $\mathit{TI}_j$ does not assign $\bot$ to any variable we
continue by scheduling the remaining conjunction $G_1, \ldots, G_{j-1}, G_{j+1}, \ldots, G_n$ with
initial ti-state $\mathit{TI}_j$. If all subgoals are eventually schedulable we have determined
both an order of evaluation for the conjunction and a final ti-state.

Example 16 

Consider scheduling the goal 

Y = [U1|U2], U2 = [], X = [U1|U3].

where X is initially RT(list(T), ground), and the remaining variables are new. The
first literal is not schedulable and will remain so until both U1 and U2 are no
longer new. We consider then the second literal, which is schedulable as a
construct, thus changing the type-instantiation of U2 to RT(list(T), elist). Since the
first literal remains unschedulable, we consider the third literal which is
schedulable as a deconstruct, thus changing the type-instantiation of X, U1 and U3 to
RT(list(T), nelist(ground)), RT(T, ground) and RT(list(T), ground), respectively.
Since both U1 and U2 are no longer new, the first literal is now schedulable as a
construct. The resulting code is

U2 := [], X =: [U1|U3], Y := [U1|U2].

In the final ti-state the instantiation of Y is given by the tree-grammar

Y -> [ti(T, ground) | ti(list(T), elist)]
ti(list(T), elist) -> []
ti(T, ground) -> $ground(T)$

in other words it is a list of length exactly one. □
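The left-most-schedulable loop for conjunctions can be sketched as follows. The sketch abstracts each literal into a hypothetical callable that maps a ti-state to `(code, new_ti)` or `None`, and uses the empty frozenset as a stand-in for $\bot$; neither device comes from the paper.

```python
NEW = "new"
BOTTOM = frozenset()   # stand-in for the bottom ti-grammar

def schedule_conj(goals, ti):
    """Reorder goals; each goal is a callable ti -> (code, new_ti) | None."""
    pending, code = list(goals), []
    while pending:
        for k, g in enumerate(pending):
            res = g(ti)
            if res is not None:
                break                   # left-most schedulable goal found
        else:
            return None                 # nothing schedulable: mode error or delay
        step, new_ti = res
        if any(r == BOTTOM for r in new_ti.values()):
            # The goal must fail: keep what was scheduled so far, then fail.
            return code + ["fail"], {x: BOTTOM for x in new_ti}
        code.append(step)
        ti = new_ti
        del pending[k]
    return code, ti
```

Restarting the scan from the left after every scheduled goal is what keeps the result as close as possible to the programmer's order.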

To determine if a disjunction $G_1 ; \cdots ; G_n$ is schedulable for initial ti-state $\mathit{TI}$
we check whether each subgoal $G_j$ is schedulable for $\mathit{TI}$ and, if so, compute each
resulting ti-state $\mathit{TI}_j$, obtaining the final ti-state $\mathit{TI}' = \bigvee_{j \in \{1, \ldots, n\}} \mathit{TI}_j$. If this
ti-state assigns $\top$ to any variable or one of the disjuncts $G_j$ is not schedulable then
the whole disjunction is not schedulable.

To determine whether an if-then-else $G_i \to G_t ; G_e$ is schedulable for initial
ti-state $\mathit{TI}$, we determine first whether $G_i$ is schedulable for $\mathit{TI}$ with resulting ti-state
$\mathit{TI}_i$. If not, the whole if-then-else is not schedulable. Otherwise, we try to schedule
$G_t$ in state $\mathit{TI}_i$ (resulting in state $\mathit{TI}_t$ say) and $G_e$ in state $\mathit{TI}$ (resulting in state $\mathit{TI}_e$
say). The resulting ti-state is $\mathit{TI}' = \mathit{TI}_t \vee \mathit{TI}_e$. If one of $G_t$ or $G_e$ is not schedulable
or $\mathit{TI}'$ includes $\top$ the whole if-then-else is not schedulable. Note that the analysis
of $G_i \to G_t ; G_e$ is identical to that of $(G_i, G_t) ; G_e$ except that all goals of $G_i$ must
be scheduled before those of $G_t$.
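The if-then-else rule can be sketched along the same lines. As before this is only an illustration: goals are hypothetical callables from a ti-state to `(code, new_ti)` or `None`, and ti-grammars are NEW, TOP, or frozensets of denoted terms.

```python
NEW, TOP = "new", "top"

def disj(r1, r2):
    """Abstract disjunction on the simplified ti-grammars."""
    if r1 != NEW and r2 != NEW:
        return TOP if TOP in (r1, r2) else r1 | r2
    return NEW if r1 == r2 else TOP

def schedule_ite(cond, then, els, ti):
    """Schedule (cond -> then ; els); each goal is ti -> (code, new_ti) | None."""
    c = cond(ti)
    if c is None:
        return None
    t = then(c[1])     # the then-branch sees the condition's bindings
    e = els(ti)        # the else-branch starts from the original state
    if t is None or e is None:
        return None
    # Join the two branches, restricted to the variables of the initial state.
    final = {x: disj(t[1][x], e[1][x]) for x in ti}
    if TOP in final.values():
        return None    # some variable is new in one branch and bound in the other
    return (f"({c[0]} -> {t[0]} ; {e[0]})", final)
```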

5.5 Mode Declarations 

In this subsection we discuss how mode-correctness is checked for each mode dec- 
laration. 

To check that a predicate with head $p(x_1, \ldots, x_n)$ and declared (or inferred) type
$\{x_1 \mapsto t_1, \ldots, x_n \mapsto t_n\}$ satisfies the mode declaration $p(c_1 \to s_1, \ldots, c_n \to s_n)$, we
build the initial ti-state $\mathit{TI} = \{x_1 \mapsto \mathrm{RT}(t_1, c_1), \ldots, x_n \mapsto \mathrm{RT}(t_n, c_n)\}$. The body of
the predicate is then analyzed starting from the state $\mathit{TI}$. The mode declaration is
correct if (a) everything is schedulable and (b) if the final ti-state is
$\mathit{TI}' = \{x_1 \mapsto r'_1, \ldots, x_n \mapsto r'_n\}$, then for each argument variable $1 \le i \le n$,
$r'_i \preceq \mathrm{RT}(t_i, s_i)$. If the body is not schedulable or the resulting instantiations are
not strong enough, a mode error results. Note that (a) ensures that the predicate is
call mode-correct for that mode while (b) ensures that it is input-output mode-correct.

Example 17

Consider mode checking of the following code from Example 11, which makes use
of the code in the earlier figure:

:- pred dupl(list(T), list(T)).    % duplicate top of stack
:- mode dupl(in(nelist(ground)), out(nelist(ground))) is det.
dupl(S0, S) :- S0 = [], S = [].
dupl(S0, S) :- push(S0, A, S), pop(S0, A, S1).

We start by constructing the initial ti-state $\mathit{TI} = \{S0 \mapsto gnelT, S \mapsto \mathtt{new}\}$ where
gnelT = RT(list(T), nelist(ground)) is the ti-grammar shown in an earlier example.
Checking the first disjunct (rule) we have S0 = [] schedulable as a deconstruct.
The resulting ti-state assigns $\bot$ to S0, and thus the whole conjunction is
schedulable with $\mathit{TI}_1 = \{S0 \mapsto \bot, S \mapsto \bot\}$. Checking the second disjunct, we first extend
$\mathit{TI}$ to map A and S1 to new. Examining the first literal push(S0, A, S) we find
that it is not schedulable since A has instantiation new and is required to be ground.
Examining the second literal pop(S0, A, S1) we find that both modes declared
for pop/3 are schedulable. Since the second mode has more specific success
instantiations, it is chosen and the ti-grammars for A and S1 become RT(T, ground) and
RT(list(T), ground), respectively. Now the first literal is schedulable obtaining for
S the ti-grammar gnelT. Restricting to the original variables the final ti-state is
$\mathit{TI}_2 = \{S0 \mapsto gnelT, S \mapsto gnelT\}$. Taking the join $\mathit{TI}' = \mathit{TI}_1 \vee \mathit{TI}_2 = \mathit{TI}_2$. Checking
this against the declared success instantiations we find the declared mode is correct.
The code generated for the procedure is:

dupl_mode1(S0, S) :- fail.
dupl_mode1(S0, S) :- pop_mode2(S0, A, S1), push_mode1(S0, A, S).

where pop_mode2/3 and push_mode1/3 are the procedures associated with the
second and first modes of the predicates, respectively. □

Note that the HAL compiler's current mode analysis does not track variable 
dependencies and thus it may obtain a final type-instantiation state weaker than 
expected. 

Example 18 

Consider the solver type habc/0 of an earlier example. The following program does not
pass mode checking:

:- pred p(list(habc), habc).
:- mode p(list(old) -> ground, in) is semidet.
p(L, E) :- L = [].
p(L, E) :- L = [E1|L1], E = E1, p(L1, E).

The first literal of the second rule is a deconstruct. After that deconstruct variable L
is never touched and hence its instantiation is never updated; in particular it is not
updated when the instantiation of E1 and L1 change. The inferred type-instantiation
for L at the end of the second rule is thus RT(list(habc), nelist(old)) rather than
RT(list(habc), nelist(ground)). Hence, mode checking fails. □

This could be overcome by adding a definite sharing analysis and/or a dependency
based groundness analysis to the mode checking phase. Whenever a variable
which definitely shares with another (through an equation e) is touched, we modify
the resulting ti-state as if the equation e has been rescheduled to update sharing
variables. This is (partially) implemented, for example, in the alias branch of the
Mercury compiler.

6 Automatic Initialization 

As mentioned before, constraint solvers must provide an initialization procedure 
(init/1) for their solver type. This procedure takes a solver variable with in- 
stantiation new and returns it with instantiation old, after initializing whatever 
data-structures (if any) the solver needs. 

Many of the predicates exported by constraint solvers (including most con- 
straints) require the solver variables appearing as arguments to be already ini- 
tialized. Thus, explicit initializations for local variables may need to be introduced. 
Not only is this a tedious exercise for the programmer, it may even be impossible for 
multi-moded predicate definitions since each mode may require different initializa- 
tion instructions. Therefore, the HAL mode checker automatically inserts variable 
initializations. In particular, whenever a literal cannot be scheduled because there 
is a requirement for an argument of type t to be RT(t, old) when it is new and t
is a solver type, then the init/1 predicate for type t can be inserted to make the 
literal schedulable. 

Example 19 

Assume we have an integer solver with solver type cint/0.

:- pred length(list(cint), int).
:- mode length(out(list(old)), in) is nondet.
:- mode length(in(list(old)), out) is det.
length(L, N) :- L = [], N = 0.
length(L, N) :- L = [X|L1], +(N1,1,N), N > 0, length(L1, N1).

where the predicate +(X,Y,Z) models X + Y = Z and requires at least two
arguments to be ground on call and all arguments are ground on return.

For the first mode L = [X|L1] cannot be scheduled as a construct until X has a
ti-grammar different from new. Hence, X needs to be initialized. In the second mode
L = [X|L1] can be scheduled as a deconstruct and thus no initialization is needed.
The two resulting procedures are:

length_mode1(L, N) :- (L := [], N == 0
    ; +_outinin(N1, 1, N), N > 0, length_mode1(L1, N1), init(X), L := [X|L1]).
length_mode2(L, N) :- (L == [], N := 0
    ; L =: [X|L1], length_mode2(L1, N1), +_ininout(N1, 1, N), N > 0).

where we have rewritten the call to +/3 to show the mode more clearly (+_outinin
indicates that the first argument is out and the rest are in, +_ininout indicates that
the third argument is out and the other arguments are in). □

Unfortunately, unnecessary initialization may slow down execution and introduce
unnecessary variables (when it interacts with implied modes). Hence, we would like
to only add those initializations required so that mode checking will succeed. The
HAL mode checker implements this by first trying to mode the procedure without
allowing initialization. If this fails it will start from the previous partial schedule
looking for the leftmost unscheduled literal $l$ which can be scheduled by initializing
variables which (a) have either a solver type or a parameter type (e.g. $v \in V_{type}$)
and (b) do not appear in an unscheduled literal to the left which equates them to a
term (if so, chances are the equation will become a construct and no initialization is
needed). If such an $l$ is found the appropriate initialization calls are inserted before
$l$, and then scheduling continues once more trying to schedule without initialization.
If no $l$ is found the whole conjunct is not schedulable. This two phase approach is
applied at each conjunct level individually.
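The two-phase strategy can be sketched as a variant of the conjunction scheduler. This is a simplified illustration with several assumptions: each literal is a hypothetical `Goal` that is schedulable once the variables in `needs` are no longer new, `solver_vars` is the set of variables with a solver (or parameter) type, every such new variable of the chosen literal is initialized, and condition (b) of the text is not modelled.

```python
NEW, OLD = "new", "old"

class Goal:
    """A literal that is schedulable once all variables in `needs` are not new."""
    def __init__(self, text, needs, binds):
        self.text, self.needs, self.binds = text, needs, binds

    def __call__(self, ti):
        if any(ti[v] == NEW for v in self.needs):
            return None
        return self.text, {**ti, **self.binds}

def schedule_with_init(goals, ti, solver_vars):
    """Schedule goals, inserting init/1 calls only when scheduling gets stuck."""
    pending, code = list(goals), []
    while pending:
        # Phase 1: left-most literal schedulable without initialization.
        k = next((i for i, g in enumerate(pending) if g(ti) is not None), None)
        if k is None:
            # Phase 2: left-most literal schedulable after initializing
            # its new solver variables.
            for i, g in enumerate(pending):
                inits = [v for v in g.needs if ti[v] == NEW and v in solver_vars]
                trial = {**ti, **{v: OLD for v in inits}}
                if inits and g(trial) is not None:
                    code += [f"init({v})" for v in inits]
                    ti, k = trial, i
                    break
            else:
                return None             # the conjunct is not schedulable
        step, ti = pending.pop(k)(ti)
        code.append(step)
    return code, ti
```

After each inserted initialization the loop returns to phase 1, so later literals are again scheduled without initialization whenever possible.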

Example 20 

Consider the following program where cint/0 is a solver type: 

:- instdef evenlist(T) -> ([] ; [T | oddlist(T)]).
:- instdef oddlist(T) -> [T | evenlist(T)].

:- pred pairlist(list(cint), int).
:- mode pairlist(out(evenlist(old)), in) is nondet.
pairlist(L, N) :- N = 0, L = [].
pairlist(L, N) :- N > 0, +(N1,1,N), L = [V|L1], L1 = [V|L2], pairlist(L2, N1).

In the first phase all literals in the second rule are schedulable except L = [V|L1]
and L1 = [V|L2] which can be neither a construct (V, L1 and L2 are new) nor
a deconstruct (both L and L1 are new). In the second phase we examine the two
remaining unscheduled literals: the second literal can be scheduled by initializing
V. Once this is done the first literal can be scheduled obtaining:

pairlist(L, N) :- N == 0, L := [].
pairlist(L, N) :- N > 0, +_outinin(N1, 1, N), pairlist(L2, N1),
    init(V), L1 := [V|L2], L := [V|L1].

□ 

Many other different initialization heuristics could be applied. We are currently 
investigating more informed policies which give the right tradeoff between adding 
constraints as early as possible, and delaying constraints until they can become 
tests or assignments. 

7 Higher-Order Objects 

Higher-order programming is particularly important in HAL because it is the mech- 
anism used to implement dynamic scheduling, which is vital in CLP languages for 
extending and combining constraint solvers. Higher-order programming introduces 
two new kinds of literals: construction of higher-order objects and higher-order calls. 
A higher-order object is constructed using an equation of the form $h = p(x_1, \ldots, x_k)$
where $h, x_1, \ldots, x_k$ are variables and $p$ is an $n$-ary predicate with $n \ge k$. The
variable $h$ is referred to as a higher-order object. Higher-order calls are literals of the
form $call(h, x_{k+1}, \ldots, x_n)$ where $h, x_{k+1}, \ldots, x_n$ are variables. Essentially, the call
literal supplies the $n - k$ arguments missing from the higher-order object $h$.

In order to represent types and instantiations for higher-order objects we need
to extend the languages of type and instantiation expressions. The higher-order
type of a higher-order object $h$ constructed in the previous paragraph is of the
form $pred(t_{k+1}, \ldots, t_n)$ where $pred/(n - k)$ is a new special type constructor and
$t_{k+1}, \ldots, t_n$ are types. It provides the types of the $n - k$ arguments missing from $h$.
The higher-order instantiation of $h$ is of the form $pred(c_{k+1} \to s_{k+1}, \ldots, c_n \to s_n)$
(see footnote 12) where $pred/(n - k)$ is a new special instantiation construct and
$c_j \to s_j$ give the call and success instantiations of argument $j$ respectively. It
provides the modes of the $n - k$ arguments missing from $h$. Note that for the first time
we allow new instantiations appearing inside instantiation expressions (since they will
often be call instantiations). But their appearance is restricted to the outermost
arguments of higher-order instantiations.

Now we must extend the RT(t, i) operation to handle higher-order types and
instantiations. Let us first consider the case in which $i$ is the higher-order
instantiation $pred(c_{k+1} \to s_{k+1}, \ldots, c_n \to s_n)$. If $t$ is the higher-order type
$pred(t_{k+1}, \ldots, t_n)$ then RT(t, i) returns the grammar

ti(t, i) -> $ipred$(root(tc_{k+1}), root(ts_{k+1}), ..., root(tc_n), root(ts_n))

together with the grammars $tc_{k+1}, \ldots, tc_n, ts_{k+1}, \ldots, ts_n$ where $tc_j = \mathrm{RT}(t_j, c_j)$
and $ts_j = \mathrm{RT}(t_j, s_j)$. If $t$ is not a higher-order type or has the wrong arity then
$\mathrm{RT}(t, i) = \top$, indicating an error. The new constant $ipred$ simply collects the call
and success ti-grammars for the higher-order object's missing arguments.

The extension of RT(t, i) for the case of base instantiations $i$ is similar to the
treatment of type parameters. A higher-order object can be new or ground, but
if it is old this is identical to ground since higher-order objects never have an
attached solver. $\mathrm{RT}(pred(t_1, \ldots, t_n), \mathtt{new})$ is treated as before (i.e. it creates a new
ti-grammar). Similarly $\mathrm{RT}(pred(t_1, \ldots, t_n), \mathtt{ground})$ generates a production rule using
a new constant $gpred$ of the form

ti(pred(t_1, ..., t_n), ground) -> $gpred$

$\mathrm{RT}(pred(t_1, \ldots, t_n), \mathtt{old})$ generates the same grammar (since it is equivalent). Since
we will only compare the higher-order ti-grammar against other ti-grammars for the
same type we can safely omit the information about the argument types $(t_1, \ldots, t_n)$.

The new constant $gpred$ acts like $ground(v)$ but it can also be compared
with more complicated ti-grammars (with production rules for function symbol
$ipred$) of the same type. The full code for RT(t, i) is given in the appendix.

12 In reality, the determinism information also appears in the higher-order instantiation; for
simplicity we ignore it here.

Example 21

Consider the following code:

:- pred map(pred(T1,T2), list(T1), list(T2)).
:- mode map(in(pred(in,out) is det), in, out) is det.
map(H, [], []).
map(H, [A|As], [B|Bs]) :- call(H, A, B), map(H, As, Bs).

:- typedef sign -> (neg ; zero ; pos).

:- pred mult(sign, sign, sign).
:- mode mult(in, in, out) is det.

?- H1 = mult(pos), map(H1, [neg, zero, pos], L1).

The map/3 predicate takes a higher-order predicate with two missing arguments of
parametric types T1 and T2 and modes in and out, respectively. The ti-grammar
describing the input instantiation of the first argument of map/3 is the grammar
with root a1 = ti(pred(T1,T2), pred(in,out)), defined by

a1 -> $ipred$(ti(T1, ground), ti(T1, ground), new, ti(T2, ground))
ti(T1, ground) -> $ground(T1)$
ti(T2, ground) -> $ground(T2)$
new -> #fresh#

This predicate is applied to a list of T1s, returning a list of T2s. The literal
H1 = mult(pos) constructs a higher-order object which multiplies the sign of its
first argument by pos, returning the result in its second argument. The
type-instantiation of H1, RT(pred(sign, sign), pred(in, out)), is the grammar with
root a2 = ti(pred(sign,sign), pred(in,out)) and rules:

a2 -> $ipred$(ti(sign, ground), ti(sign, ground), new, ti(sign, ground))
ti(sign, ground) -> neg ; zero ; pos
new -> #fresh#

□ 

We need to extend the ordering $\preceq$ to higher-order type-instantiations as well as
the operations $\wedge$ and $\vee$. Two higher-order ti-grammars $r$ and $r'$ defined with rules

root(r) -> $ipred$(xc_1, xs_1, ..., xc_n, xs_n)

and

root(r') -> $ipred$(xc'_1, xs'_1, ..., xc'_n, xs'_n)

satisfy $r \preceq r'$ iff for $i = 1, \ldots, n$, $subg(xc'_i, r') \preceq subg(xc_i, r)$ and
$subg(xs_i, r) \preceq subg(xs'_i, r')$. Intuitively, if $r \preceq r'$, then any higher-order
$call(r', \ldots)$ should be replaceable by $call(r, \ldots)$. For this to work, two conditions
must be fulfilled. First, $r$ must be able to deal with any values that $r'$ can deal with
(and perhaps more). Thus, $subg(xc'_i, r') \preceq subg(xc_i, r)$. And second, $r$ must return the
same values as $r'$ or less. Thus, $subg(xs_i, r) \preceq subg(xs'_i, r')$. For more details see
the example below.

We define $r \preceq \mathrm{RT}(pred(t_1, \ldots, t_n), \mathtt{ground})$ for any ti-grammar $r$ of the
appropriate type except new. The full definition of $\preceq$ is given in the appendix. The
$\wedge$ and $\vee$ operations follow naturally from the ordering, and are given in the appendix.

Example 22

Consider the following code and goal: 

:- typedef abc -> a ; b ; c.
:- instdef ab -> a ; b.

:- pred ho1(abc, abc).
:- mode ho1(in(ab), out(ab)) is det.
ho1(A,B) :- A = B.

:- pred ho2(abc, abc).
:- mode ho2(in, out) is det.
ho2(A,B) :- A = a, B = b.
ho2(A,B) :- A = b, B = c.
ho2(A,B) :- A = c, B = a.

?- HO1 = ho1, HO2 = ho2, (HO = HO1 ; HO = HO2).

During scheduling of the disjunction, the ti-grammar for HO1 is rt(pred(abc, abc),
pred(in(ab), out(ab))), i.e.:

ho1 → $ipred$(gndab, gndab, new, gndab)
gndab → a ; b
new → #fresh#

and the ti-grammar for HO2 is

ho2 → $ipred$(gndabc, gndabc, new, gndabc)
gndabc → a ; b ; c
new → #fresh#

The abstract disjunction of these two grammars to build the ti-grammar for HO
gives

ho → $ipred$(gndab, gndabc, new, gndabc)

Notice the call ti-grammars have been abstractly conjoined. This illustrates the
contravariant nature of calling instantiations of higher-order predicates. The
higher-order object in HO can only be safely applied to an input a or b since it may be
predicate ho1. It can only be guaranteed to give output a, b or c since it may be
predicate ho2. □
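
The disjunction in this example can be reproduced with a small Python sketch. It assumes the same simplification as before: ground instantiations are finite sets of values, so abstract conjunction is intersection and abstract disjunction is union. The mixed case of new against a non-new instantiation is deliberately left out, and the names meet, join and ho_join are hypothetical rather than taken from the HAL compiler.

```python
NEW = "new"  # stands for the instantiation `new`

def meet(a, b):
    """Abstract conjunction on the simplified domain (set intersection)."""
    if a == NEW and b == NEW:
        return NEW
    if a == NEW or b == NEW:
        raise ValueError("mixed new/non-new case is not modelled in this sketch")
    return a & b

def join(a, b):
    """Abstract disjunction on the simplified domain (set union)."""
    if a == NEW and b == NEW:
        return NEW
    if a == NEW or b == NEW:
        raise ValueError("mixed new/non-new case is not modelled in this sketch")
    return a | b

def ho_join(r1, r2):
    """Disjunction of two higher-order instantiations [(call, success), ...]:
    call instantiations are conjoined, success instantiations are disjoined."""
    return [(meet(c1, c2), join(s1, s2)) for (c1, s1), (c2, s2) in zip(r1, r2)]

gndab, gndabc = frozenset("ab"), frozenset("abc")
ho1 = [(gndab, gndab), (NEW, gndab)]      # mode ho1(in(ab), out(ab))
ho2 = [(gndabc, gndabc), (NEW, gndabc)]   # mode ho2(in, out)

ho = ho_join(ho1, ho2)
print(ho == [(gndab, gndabc), (NEW, gndabc)])  # True
```

The result corresponds to the rule ho → $ipred$(gndab, gndabc, new, gndabc) given above.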

7.1 Scheduling Higher-Order

Intuitively, a higher-order equation h = p(x_1, ..., x_k) is schedulable if h is new
and x_1, ..., x_k are at least as instantiated as the call instantiations of one of the
modes declared for p/n. If this is true for more than one mode, we again choose one
schedulable mode (using the same criteria used for calls to first-order predicates)
and commit to it. If it is not true for any mode, the equation is delayed until the
arguments become more instantiated. Formally, let the current ti-state be TI =
{h ↦ r, x_1 ↦ r_1, ..., x_k ↦ r_k} ∪ RTI and the types {x_1 ↦ t_1, ..., x_k ↦ t_k}. Let
the (declared or inferred) predicate type of p/n be p(dt_1, ..., dt_n); then (because of
type correctness) there exists θ such that θ(dt_j) = t_j.

Consider the declared mode p(c_1 → s_1, ..., c_n → s_n). The higher-order equation
is schedulable if r = new and for each 1 ≤ j ≤ k, r_j ⪯ rt(t_j, c_j) ∧ r_j ≠ new. The
resulting ti-state is

{h ↦ r', x_1 ↦ r_1, ..., x_k ↦ r_k} ∪ RTI

where tc_j = rt(θ(dt_j), c_j) and ts_j = rt(θ(dt_j), s_j) for k+1 ≤ j ≤ n and r' =
{a → $ipred$(root(tc_{k+1}), root(ts_{k+1}), ..., root(tc_n), root(ts_n))}, where a is a new
non-terminal, together with the grammars for tc_{k+1}, ts_{k+1}, ..., tc_n, ts_n.

Note that the instantiation of each Xj is unchanged and, in fact, will not be 
updated even when h is called. This is because in general we cannot ensure when 
or if the call has actually been made. As a result, mode checking with higher-order 
objects can be imprecise. In particular, if one of the r_j is new we may not know if it
becomes initialised or not since we do not know if the call to h which will initialise
it has been made. Since we must be able to precisely track when a variable has
become initialised, we do not allow a call to be scheduled if this is the case (hence
the r_j ≠ new condition above).

A higher-order call call(h, x_{k+1}, ..., x_n) is schedulable if x_{k+1}, ..., x_n are at
least as instantiated as the call instantiations of the arguments of the higher-order
type-instantiation previously assigned to h. If this is not true, the call is delayed
until the arguments become more instantiated. Formally, let the current ti-state be
{h ↦ r, x_{k+1} ↦ r_{k+1}, ..., x_n ↦ r_n} ∪ RTI. The call is schedulable if r has a
production rule of the form

root(r) → $ipred$(xc_{k+1}, xs_{k+1}, ..., xc_n, xs_n)

and for each j ∈ k+1, ..., n, r_j ⪯ subg(xc_j, r). The resulting instantiation is
TI' = {h ↦ r, x_{k+1} ↦ r_{k+1} ∧ subg(xs_{k+1}, r), ..., x_n ↦ r_n ∧ subg(xs_n, r)} ∪ RTI.
Just as for normal predicate calls, implied modes are also possible: if, for
example, xc_i is new, we can replace x_i with a fresh variable fresh_i and a following
equation fresh_i = x_i. And, if necessary, the mode checker will add calls to initialise
solver variables.
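
The schedulability test for a higher-order call can be sketched as follows. This is an illustrative Python model only: instantiations are either NEW or finite sets of ground values, implied modes and solver-variable initialisation are not modelled, and the names leq and schedule_call are hypothetical.

```python
NEW = "new"  # stands for the instantiation `new`

def leq(a, b):
    """a ⪯ b on the simplified domain: `new` is comparable only with itself."""
    if a == NEW or b == NEW:
        return a == b
    return a <= b

def schedule_call(ho, args):
    """Check call(h, x_{k+1}, ..., x_n) against the instantiation of h.

    ho   -- instantiation of h as a list of (call, success) pairs
    args -- current instantiations r_{k+1}, ..., r_n of the arguments
    Returns the updated argument instantiations, or None if the call
    has to be delayed.
    """
    if len(ho) != len(args):
        return None
    if not all(leq(r, call) for r, (call, _) in zip(args, ho)):
        return None
    # r_j ∧ success_j: a `new` argument simply takes the success instantiation.
    return [succ if r == NEW else r & succ for r, (_, succ) in zip(args, ho)]

sign = frozenset({"neg", "zero", "pos"})
h = [(sign, sign), (NEW, sign)]  # pred(in, out) over sign, as for mult(pos)

print(schedule_call(h, [frozenset({"neg"}), NEW]) == [frozenset({"neg"}), sign])  # True
print(schedule_call(h, [NEW, NEW]))  # None
```

The second call is delayed because its first argument is still new while the call instantiation requires a ground sign.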

Example 23 

Consider the following code and assume all goals are schedulable in the order writ- 
ten: 

:- instdef only_a -> a.
:- modedef abc2a -> (ground -> only_a).
:- pred p(abc, abc, abc).
:- mode p(abc2a, in, out(only_a)) is semidet.

?- G1, p(A, B, C), G2.

?- G1, H = p(A), call(H, B, C), G2.

The two queries would appear to have identical effects. However, mode checking for
the second goal will not determine that the instantiation for A becomes only_a by
the time it reaches goal G2. Assuming A was ground before H = p(A), then the
type-instantiation of H is the grammar with root x = ti(pred(abc, abc), pred(in, out(only_a)))
and rules:

x → $ipred$(ti(abc, ground), ti(abc, ground), new, ti(abc, only_a))
ti(abc, ground) → a ; b ; c
ti(abc, only_a) → a
new → #fresh#

Of course in this case it is obvious that the predicate is being called before G2, and so
it could be inferred that the instantiation of A was only_a at that point. However, in 
the usual case such analysis is harder, since the construction of a higher-order term 
and its eventual execution are usually performed in different predicates. Indeed, in 
general it is impossible to know at compile time whether at a given program point 
the higher-order predicate has been executed or not. □ 



8 Polymorphism and Modes 

Polymorphic predicates are very useful because they can be used for different types. 
Unfortunately, mode information can be lost since only the base instantiations 
ground, new, and old can be associated with type parameters. 

Example 24 

Consider the interface to the stack data type defined in Figure ^ and the following 
program: 

:- pred q(abc).
:- mode q(in) is semidet.
:- mode q(in(only_a)) is det.

?- empty(S0), I0 = a, push(S0, I0, S1), pop(S1, I, S2), q(I).

Although list S1 is indeed a list only containing items a, this information is lost after
executing push since the output instantiation declared for this predicate is simply
ground. Because of this, the first mode of predicate q/1 will be selected for literal
q(I), thus losing the information that q(I) could not fail. □

This loss of instantiation information for arguments to polymorphic predicates 
may have severe consequences for higher-order objects because the base instantia- 
tion ground applied to polymorphic code does not contain enough information for 
the higher-order object to be used (called). 

Example 25 

Consider the following goal using code from Figure ^ and Example 12:

?- empty(S0), I0 = mult(pos), push(S0, I0, S1), pop(S1, I, S2), map(I, [neg], S).

When item I is extracted from the list its ti-grammar is rt(t, ground) where t is type
pred(sign, sign). As a result, it cannot be used in map since its mode and determinism
information has been lost, i.e. the check rt(t, ground) ⪯ rt(t, pred(in, out)) fails. □

We could overcome the above problem by having a special version of each stack
predicate to handle the higher-order predicate case. But this requires modifying the
stack module, defeating the idea of an abstract data type. Also, this modification
is required for each mode of the higher-order object that the programmer wishes
to make use of. Clearly, this is not an attractive proposition.

COLLECT_SET(r_1, r_2, P)
    x_1 := root(r_1); x_2 := root(r_2)
    if ((x_1, x_2) ∈ P) return ∅
    if (x_1 = new) return ∅
    if (x_1 → $old(v)$ ∈ r_1) return {(old, v, r_2)}
    if (x_1 → $ground(v)$ ∈ r_1) return {(ground, v, r_2)}
    M := ∅
    foreach rule x_1 → f(x_11, ..., x_1n) in r_1
        if exists rule x_2 → f(x_21, ..., x_2n) in r_2
            for i := 1..n
                M := M ∪ COLLECT_SET(subg(x_1i, r_1), subg(x_2i, r_2), P ∪ {(x_1, x_2)})
    return M

Fig. 3. Algorithm for collecting the type-instantiations that match type parameters.

Our approach is to use polymorphic type information to recover the lost mode 
information. This is an example of "Theorems for Free" (Wadler 1989): since the
polymorphic code can only equate terms with polymorphic type, it cannot create 
instantiations and, thus, the output instantiations of polymorphic arguments must 
result from the calling instantiations of non-output arguments. Hence, they have to 
be at least as instantiated as the join of the input instantiations. 

8.1 Polymorphic Mode Checking 

To recover instantiation information we extend mode checking for procedures with
polymorphic types to take into account the extra mode information that is implied
by the polymorphic type. Consider the predicate call p(x_1, ..., x_n) where
x_1, ..., x_n are variables with type {x_1 ↦ t_1, ..., x_n ↦ t_n} and current ti-state
TI = {x_1 ↦ r_1, ..., x_n ↦ r_n} ∪ RTI. Suppose the predicate type declared (or
inferred) for p is p(dt_1, ..., dt_n). Note that because of type correctness there exists
a type substitution θ where θ(dt_j) = t_j.

Assume the literal is schedulable for mode declaration p(c_1 → s_1, ..., c_n → s_n).
We proceed by matching the ti-trees rt(dt_j, c_j) against the current instantiations
r_j in a process analogous to the matching that occurs in the meet function. Note
that rt(dt_j, c_j) is the ti-grammar which contains information on the positions of
type parameters in the declared type of p.

Consider the function COLLECT_SET(r_1, r_2, ∅), defined in Figure 3, which returns
the set of triples (old, v, r') and (ground, v, r') obtained by collecting each
ti-grammar, r', in r_2 matching occurrences of $old(v)$ and $ground(v)$ in r_1. Let
M = ∪_{j=1..n} COLLECT_SET(rt(dt_j, c_j), r_j, ∅). We will use this information to compute
the success instantiations as follows: since the only success type-instantiation
information for elements of parametric type v must come from its call type-instantiations,
we can safely assume that any success type-instantiation is at least as instantiated
as the join (upper bound) of the calls.

Note that when determining ground success information, we need only consider 
ground calling instantiations, since ground success instantiations cannot result from 
old call instantiations. On the other hand, for old success information, we need to 
consider both old and ground calling instantiations, since old success instantia- 
tions can result from either. Hence the following definitions for ground(v, M) and 
old(v,M), which compute upper bounds on success instantiations for v based on 
the call instantiation information collected in M: 

ground(v, M) = ∨ {r | (ground, v, r) ∈ M}

old(v, M) = ∨ {r | (ground, v, r) ∈ M or (old, v, r) ∈ M}

Because the literal is schedulable for the given mode we know that no r_i contains
new for any i. Thus, the abstract disjunctions in ground(v, M) and old(v, M) never
lead to ⊤.
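
As a rough illustration of these two definitions, the following Python sketch computes ground(v, M) and old(v, M) over a simplified domain. It assumes each collected instantiation is a finite set of possible values, with the token "VAR" standing for an unbound solver variable in old instantiations, so that the abstract disjunction is set union. The function names ground_of and old_of and the sample triples are hypothetical.

```python
from functools import reduce

def join_all(insts):
    """Abstract disjunction of a collection of set-based instantiations."""
    return reduce(lambda a, b: a | b, insts, frozenset())

def ground_of(v, M):
    """Upper bound on ground success instantiations for type parameter v."""
    return join_all(r for (kind, w, r) in M if w == v and kind == "ground")

def old_of(v, M):
    """Upper bound on old success instantiations for type parameter v."""
    return join_all(r for (kind, w, r) in M if w == v and kind in ("ground", "old"))

# Matching push(S0, I0, S1) with I0 bound to a, as in Example 24:
M = [("ground", "T", frozenset({"a"}))]
print(ground_of("T", M) == frozenset({"a"}))  # True

# A hypothetical M mixing ground and old calling instantiations:
M2 = [("ground", "T", frozenset({"a"})),
      ("ground", "T", frozenset({"b"})),
      ("old", "T", frozenset({"a", "VAR"}))]
print(ground_of("T", M2) == frozenset({"a", "b"}))      # True
print(old_of("T", M2) == frozenset({"a", "b", "VAR"}))  # True
```

Under these assumptions the items of S1 in Example 24 would be known to be a only, which is the information the declared mode of push alone discards.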

Let ps_j be the result of replacing in rt(dt_j, s_j) each non-terminal x with
productions of the form

x → $ground(v)$ ; $old(v)$

by root(old(v, M)) and removing the rules for x, and replacing each non-terminal
x with productions of the form

x → $ground(v)$

by root(ground(v, M)) and removing the rules for x, and finally adding the rules
in old(v, M) and ground(v, M). The new ti-state resulting after scheduling the
polymorphic literal is TI' = {x_1 ↦ r_1 ∧ ps_1, ..., x_n ↦ r_n ∧ ps_n} ∪ RTI.

Example 26 

Assume we are scheduling the push/3 literal in the goal using code from Figure ^
and Example 12

?- empty(S0), I0 = mult(pos), push(S0, I0, S1), pop(S1, I, S2), map(I, [neg], S).

for current ti-state {S0 ↦ r_3, I0 ↦ r_4}, the remaining variables being new, where
r_3 is the grammar

ti(list(sign), elist) → []

and r_4 is the grammar with root a = ti(pred(sign, sign), pred(in, out)) defined by

a → $ipred$(ti(sign, ground), ti(sign, ground), new, ti(sign, ground))
ti(sign, ground) → neg ; zero ; pos

The ti-grammars defined by the declared type and mode declarations for the first
two arguments of push/3 are: r_5 = rt(list(T), ground), the grammar

ti(list(T), ground) → [] ; [ti(T, ground) | ti(list(T), ground)]
ti(T, ground) → $ground(T)$

and r_6 = rt(T, ground), the grammar

ti(T, ground) → $ground(T)$

The literal is schedulable and the matching process determines that COLLECT_SET(r_5, r_3) =
∅, COLLECT_SET(r_6, r_4) = {(ground, T, r_4)} and M = {(ground, T, r_4)}. The improved
analysis determines the extra success instantiation (ps_3) for the third argument
(S1) by improving rt(list(T), nelist(ground)), which is

ti(list(T), nelist(ground)) → [ti(T, ground) | ti(list(T), list(ground))]
ti(list(T), list(ground)) → [] ; [ti(T, ground) | ti(list(T), list(ground))]
ti(T, ground) → $ground(T)$

replacing the last rule by r_4 and occurrences of ti(T, ground) by root(r_4) = a,
obtaining

ti(list(T), nelist(ground)) → [a | ti(list(T), list(ground))]
ti(list(T), list(ground)) → [] ; [a | ti(list(T), list(ground))]
a → $ipred$(ti(sign, ground), ti(sign, ground), new, ti(sign, ground))
ti(sign, ground) → neg ; zero ; pos

Note that the mode information of the higher-order term has been preserved. The
mode checking for the call to pop/3 will similarly preserve the higher-order mode
information, and the original goal will be schedulable. □
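
The matching step of this example can be replayed with a short Python sketch of COLLECT_SET. The grammar encoding is an assumption made for illustration: a grammar is a dictionary from non-terminal names to lists of (label, args) productions, the labels "$ground" and "$old" carry a type parameter, and a matched sub-grammar is identified by its root name rather than returned as a full grammar. All identifiers (collect_set, R3, R4, R5, R6 and the non-terminal names) are hypothetical.

```python
def collect_set(g1, x1, g2, x2, seen=frozenset()):
    """Collect (kind, type_parameter, matched_root) triples, following Fig. 3.

    g1, g2 -- grammars: dict mapping a non-terminal to [(label, args), ...]
    x1, x2 -- the non-terminals currently being matched
    seen   -- pairs of non-terminals already visited (the set P)
    """
    if (x1, x2) in seen or x1 == "new":
        return set()
    prods1 = g1[x1]
    for label, args in prods1:
        if label == "$old":
            return {("old", args[0], x2)}
    for label, args in prods1:
        if label == "$ground":
            return {("ground", args[0], x2)}
    found = set()
    prods2 = dict(g2.get(x2, []))  # functor -> argument non-terminals
    for functor, kids1 in prods1:
        if functor in prods2:
            for k1, k2 in zip(kids1, prods2[functor]):
                found |= collect_set(g1, k1, g2, k2, seen | {(x1, x2)})
    return found

# r_5 = rt(list(T), ground) and r_6 = rt(T, ground)
R5 = {"list_T": [("[]", ()), ("cons", ("T_gnd", "list_T"))],
      "T_gnd": [("$ground", ("T",))]}
R6 = {"T_gnd": [("$ground", ("T",))]}
# r_3 (the empty list) and r_4 (the higher-order object with root a)
R3 = {"elist": [("[]", ())]}
R4 = {"a": [("$ipred", ("sign", "sign", "new", "sign"))],
      "sign": [("neg", ()), ("zero", ()), ("pos", ())],
      "new": [("#fresh", ())]}

M = collect_set(R5, "list_T", R3, "elist") | collect_set(R6, "T_gnd", R4, "a")
print(M)  # {('ground', 'T', 'a')}
```

The single triple corresponds to M = {(ground, T, r_4)} above, with r_4 represented by its root a.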

The interaction between polymorphic mode analysis and higher-order constructs 
and calls is in fact slightly more complicated than discussed previously. This is 
because higher-order objects allow us to give arguments to a predicate in a piece- 
wise manner. This affects the execution of COLLECT_SET which was collecting the 
set M over all predicate arguments simultaneously. In order to handle these ac- 
curately we need to store the information from M found during the higher-order 
object construction, to be used in the higher-order call. That is, we need to store 
ground(v, M) and old(v, M) for each type parameter v appearing in the remaining 
arguments as part of the ti-grammar for the higher-order object. 

An alternative approach used by the HAL compiler is to update the success 
instantiations stored in the ti-grammar of the higher-order object based on the extra 
information from polymorphism. When the call to the higher-order polymorphic 
predicate is analyzed, the matching process also matches the success instantiations 
of the higher-order object to recover the previous matching information. 

9 Conclusions and Future Work 

The ultimate aim of mode checking is to ensure that the compiler has correct instan- 
tiation information at every program point in order to allow program optimization. 
It is reasonably straightforward (but laborious) to show that the mode checking 
defined in this paper ensures that the resulting program has input-output and call 
correctness. Some subtle points that arise are as follows. First, it is an invariant 
that any ti-grammar (or sub-grammar) r occurring in the mode checking process 
that contains rule root(r) → #var# must be equivalent to rt(t, old) for some t,
which means that when variables are bound indirectly (through shared variables)
the correctness of the ti-state is maintained. Second, if a procedure is input-output
correct for its declared type, then it is also input-output correct for any instance
of the type. This follows from the limited possibilities for manipulating objects of 
variable type (essentially copying and testing equality). 

This means that compiler optimizations can safely be applied. The only mode er- 
ror that may be detected at run-time arises from situations explained in Section [5. 21 
and Example El 13 The compiler emits warnings when such a possibility exists. 

We have described for the first time mode checking for CLP languages, such 
as HAL, which have strong typing and re-orderable clause bodies, and described 
the algorithms currently used in the HAL compiler. The actual implementation of 
these algorithms in the HAL compiler is considerably more sophisticated than the 
simple presentation here. Partial schedules are computed and stored and accessed 
only when enough new instantiation information has been created to reassess them. 
Operations such as ^ are tabled and hence many operations are simply a lookup in a 
table. We have found mode checking is efficient enough for a practical compiler. For 
the compiler compiling itself (29000 lines of HAL code in 27 highly interdependent
modules compiled in 15 mins 20 secs) mode checking requires 16.4% of overall
compile time. While compiling the libraries (4600 lines of HAL code in 12 almost
independent modules compiled in 47 secs) it takes 13.1% of overall compile time.
And compiling a suite of small to medium size benchmarks (6200 lines of HAL code
in 67 modules compiled in 183 secs) it takes 13.0% of overall compile time.

There is considerable scope for future work. One aim is to strengthen mode 
checking. We plan to add tracking of aliasing and groundness dependencies. Another 
problem is that currently HAL (like Mercury) never undoes a feasible choice of 
ordering the literals. This can lead to correctly moded programs not being checkable 
as in Example 27. In practice this behavior is rare, but we would like to explore
more complete strategies. 

Example 27 

Consider the following declarations and goal: 

:- pred p(list(int), list(int)).
:- mode p(out, out) is det.
:- mode p(in(evenlist(ground)), out(evenlist(ground))) is det.
:- pred q(list(int)).
:- mode q(out(evenlist(ground))) is det.
:- pred r(list(int)).
:- mode r(in(evenlist(ground))) is det.

?- p(L0, L1), q(L0), r(L1).

The first two literals of the goal are schedulable in the order given, as p_mode1(L0,
L1), q_mode1(L2), L2 = L0, but then r(L1) is not schedulable (the list L1 may
not be of even length). There is a feasible schedule, q_mode1(L0), p_mode2(L0,
L1), r_mode1(L1), which is missed by both HAL and Mercury, since they don't
undo the feasible schedule for the first two literals. In order to avoid this problem
HAL allows the user to name modes of a predicate and hence specify exactly which
mode is required. □

13 Note this does not invalidate the input-output or call correctness for the remainder of the
program.
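
The difference between committing to the first feasible choice and searching over choices can be sketched on the declarations of Example 27. The Python model below is a deliberate simplification and not the scheduling algorithm of HAL or Mercury: instantiations form the three-point chain evenlist, ground, new; a mode maps each variable to a (call, success) pair; and a call instantiation of new accepts any current instantiation, standing in for implied modes. All function and variable names are hypothetical, and mode indices are zero-based, so index 1 of p corresponds to p_mode2.

```python
RANK = {"evenlist": 0, "ground": 1, "new": 2}  # smaller = more instantiated

def apply_mode(state, mode):
    """Return the ti-state after scheduling `mode`, or None if not schedulable."""
    new_state = dict(state)
    for var, (call, success) in mode.items():
        cur = state[var]
        if call != "new" and (cur == "new" or RANK[cur] > RANK[call]):
            return None
        new_state[var] = min(cur, success, key=RANK.get)  # conjoin with success
    return new_state

def first_mode(state, modes):
    """First declared mode that is schedulable in `state`, with its result."""
    for i, mode in enumerate(modes):
        nxt = apply_mode(state, mode)
        if nxt is not None:
            return i, nxt
    return None

def greedy(literals, state):
    """Commit to the first schedulable literal and mode; never undo a choice."""
    schedule, pending = [], list(literals)
    while pending:
        for lit in pending:
            hit = first_mode(state, lit[1])
            if hit is not None:
                schedule.append((lit[0], hit[0]))
                state = hit[1]
                pending.remove(lit)
                break
        else:
            return None  # stuck: no remaining literal is schedulable
    return schedule

def exhaustive(literals, state):
    """Backtrack over literal order and mode choice."""
    if not literals:
        return []
    for k, (name, modes) in enumerate(literals):
        for i, mode in enumerate(modes):
            nxt = apply_mode(state, mode)
            if nxt is not None:
                rest = exhaustive(literals[:k] + literals[k + 1:], nxt)
                if rest is not None:
                    return [(name, i)] + rest
    return None

P = ("p", [{"L0": ("new", "ground"), "L1": ("new", "ground")},
           {"L0": ("evenlist", "evenlist"), "L1": ("new", "evenlist")}])
Q = ("q", [{"L0": ("new", "evenlist")}])
R = ("r", [{"L1": ("evenlist", "evenlist")}])
GOAL = [P, Q, R]
START = {"L0": "new", "L1": "new"}

print(greedy(GOAL, START))      # None
print(exhaustive(GOAL, START))  # [('q', 0), ('p', 1), ('r', 0)]
```

In this model the committed strategy gets stuck at r(L1), while the backtracking search finds the schedule q, p (second mode), r described in the example. Such a search is exponential in the worst case, which is one reason a more complete strategy needs care.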

A second aim is to improve the efficiency of the reordered code, by, for instance, 
reducing the number of initializations. The final aim is to provide mode inference as 
well as mode checking — the ability to reorder body literals makes this a potentially 
very expensive process. 

Appendix A Algorithms

In this appendix we give full versions of the tree operations mentioned in the paper. The 
basic tree operations are relatively straightforward, but new kinds of nodes for solver 
variables, polymorphic types and higher-order terms complicate this somewhat. Recall 
that we assume we are dealing with type correct programs, hence the operations make use 
of this to avoid many redundant comparisons. For example when comparing the order of 
two ti-grammars, then if one is a predicate type, the other must be an identical predicate 
type. 

The ordering relation r_1 ⪯ r_2 on two ti-grammars is defined as the result of LT(r_1, r_2, ∅).

LT(p_1, p_2, P)
    if (p_2 = ⊤) return true
    if (p_1 = ⊤) return false
    if ((root(p_1), root(p_2)) ∈ P) return true
    if (p_2 = new and p_1 ≠ new) return false
    case:
        p_1 = new: return (p_2 = new)
        root(p_1) → $old(v)$ ∈ p_1:        %% p_1 = BASE(v, old)
            return root(p_2) → $old(v)$ ∈ p_2
        root(p_1) → $ground(v)$ ∈ p_1:     %% p_1 = BASE(v, ground)
            return root(p_2) → $ground(v)$ ∈ p_2
        root(p_1) → $gpred$ ∈ p_1:         %% p_1 = BASE(pred(t_1, ..., t_n), ground)
            return root(p_2) → $gpred$ ∈ p_2
        root(p_1) → $ipred$(tc_1, ts_1, ..., tc_n, ts_n) ∈ p_1:    %% non-base higher-order ti
            if (root(p_2) → $gpred$ ∈ p_2) return true
            let root(p_2) → $ipred$(tc'_1, ts'_1, ..., tc'_n, ts'_n) ∈ p_2
            for i := 1..n
                if (¬LT(subg(tc'_i, p_2), subg(tc_i, p_1), P ∪ {(root(p_1), root(p_2))})) return false
                if (¬LT(subg(ts_i, p_1), subg(ts'_i, p_2), P ∪ {(root(p_1), root(p_2))})) return false
            endfor
            return true
        default:
            foreach root(p_1) → f(x_1, ..., x_n) ∈ p_1
                if (∃ root(p_2) → f(x'_1, ..., x'_n) ∈ p_2)
                    for i := 1..n
                        if (¬LT(subg(x_i, p_1), subg(x'_i, p_2), P ∪ {(root(p_1), root(p_2))})) return false
                    endfor
                else return false
            endfor
            return true

The abstract conjunction operation r_1 ∧ r_2 on two ti-grammars is defined as the first
element of the pair returned by CONJ(r_1, r_2, ∅).

CONJ(p_1, p_2, P)
    if (p_1 = ⊤) return (⊤, _)
    if (p_2 = ⊤) return (⊤, _)
    if (p_2 = new) return (p_1, root(p_1))
    if (meet(root(p_1), root(p_2)) ∈ P) return (∅, meet(root(p_1), root(p_2)))
    case:
        p_1 = new: return (p_2, root(p_2))
        root(p_1) → $old(v)$ ∈ p_1:        %% p_1 = BASE(v, old)
            return (p_2, root(p_2))
        root(p_1) → $ground(v)$ ∈ p_1:     %% p_1 = BASE(v, ground)
            return (p_1, root(p_1))
        root(p_1) → $gpred$ ∈ p_1:         %% p_1 = BASE(pred(t_1, ..., t_n), ground)
            return (p_2, root(p_2))
        root(p_1) → $ipred$(xc_1, xs_1, ..., xc_n, xs_n) ∈ p_1:    %% non-base higher-order ti
            if (root(p_2) → $gpred$ ∈ p_2) return (p_1, root(p_1))
            let root(p_2) → $ipred$(xc'_1, xs'_1, ..., xc'_n, xs'_n) ∈ p_2
            for i := 1..n
                (tc_i, xc''_i) := DISJ(subg(xc_i, p_1), subg(xc'_i, p_2), P)
                (ts_i, xs''_i) := CONJ(subg(xs_i, p_1), subg(xs'_i, p_2), P)
                if (tc_i = ⊤ or ts_i = ⊤) return (⊤, _)
            endfor
            p := {meet(root(p_1), root(p_2)) → $ipred$(xc''_1, xs''_1, ..., xc''_n, xs''_n)} ∪
                 tc_1 ∪ ... ∪ tc_n ∪ ts_1 ∪ ... ∪ ts_n
            return (p, meet(root(p_1), root(p_2)))
        default:
            p := ∅
            foreach root(p_1) → f(x_1, ..., x_n) ∈ p_1
                if (∃ root(p_2) → f(x'_1, ..., x'_n) ∈ p_2)
                    for i := 1..n
                        (p''_i, x''_i) := CONJ(subg(x_i, p_1), subg(x'_i, p_2), P ∪ {meet(root(p_1), root(p_2))})
                        if (p''_i = ⊤) return (⊤, _)
                    endfor
                    p := {meet(root(p_1), root(p_2)) → f(x''_1, ..., x''_n)} ∪ p ∪ p''_1 ∪ ... ∪ p''_n
            endfor
            return (p, meet(root(p_1), root(p_2)))

The abstract disjunction operation r_1 ∨ r_2 on two ti-grammars is defined as the first
element of the pair returned by DISJ(r_1, r_2, ∅).

DISJ(p_1, p_2, P)
    if (p_1 = ⊤) return (⊤, _)
    if (p_2 = ⊤) return (⊤, _)
    if (p_1 = new and p_2 = new) return ({new → #fresh#}, new)
    if (p_2 = new) return (⊤, _)
    if (∃ join(root(p_1), root(p_2)) ∈ P) return (∅, join(root(p_1), root(p_2)))
    case:
        p_1 = new: return (⊤, _)
        root(p_1) → $old(v)$ ∈ p_1:        %% p_1 = BASE(v, old)
            return (p_1, root(p_1))
        root(p_1) → $ground(v)$ ∈ p_1:     %% p_1 = BASE(v, ground)
            return (p_2, root(p_2))
        root(p_1) → $gpred$ ∈ p_1:         %% p_1 = BASE(pred(t_1, ..., t_n), ground)
            return (p_1, root(p_1))
        root(p_1) → $ipred$(xc_1, xs_1, ..., xc_n, xs_n) ∈ p_1:    %% non-base higher-order ti
            if (root(p_2) → $gpred$ ∈ p_2) return (p_2, root(p_2))
            let root(p_2) → $ipred$(xc'_1, xs'_1, ..., xc'_n, xs'_n) ∈ p_2
            for i := 1..n
                (tc_i, xc''_i) := CONJ(subg(xc_i, p_1), subg(xc'_i, p_2), P)
                (ts_i, xs''_i) := DISJ(subg(xs_i, p_1), subg(xs'_i, p_2), P)
                if (tc_i = ⊤ or ts_i = ⊤) return (⊤, _)
            endfor
            p := {join(root(p_1), root(p_2)) → $ipred$(xc''_1, xs''_1, ..., xc''_n, xs''_n)} ∪
                 tc_1 ∪ ... ∪ tc_n ∪ ts_1 ∪ ... ∪ ts_n
            return (p, join(root(p_1), root(p_2)))
        default:
            p := ∅
            foreach root(p_1) → f(x_1, ..., x_n) ∈ p_1
                if (∃ root(p_2) → f(x'_1, ..., x'_n) ∈ p_2)
                    for i := 1..n
                        (p''_i, x''_i) := DISJ(subg(x_i, p_1), subg(x'_i, p_2), P ∪ {join(root(p_1), root(p_2))})
                        if (p''_i = ⊤) return (⊤, _)
                    endfor
                    p := {join(root(p_1), root(p_2)) → f(x''_1, ..., x''_n)} ∪ p ∪ p''_1 ∪ ... ∪ p''_n
                else
                    p := {join(root(p_1), root(p_2)) → f(x_1, ..., x_n)} ∪ p ∪
                         subg(x_1, p_1) ∪ ... ∪ subg(x_n, p_1)
                endif
            endfor
            foreach root(p_2) → f(x'_1, ..., x'_n) ∈ p_2
                if (¬∃ root(p_1) → f(x_1, ..., x_n) ∈ p_1)
                    p := {join(root(p_1), root(p_2)) → f(x'_1, ..., x'_n)} ∪ p ∪
                         subg(x'_1, p_2) ∪ ... ∪ subg(x'_n, p_2)
            endfor
            return (p, join(root(p_1), root(p_2)))

The rt operation constructs a ti-grammar from a type t and instantiation i and is
defined as the first element in the pair resulting from RT(t, i, ∅).

RT(t, i, P)
    if (∃ ti(t, i) ∈ P) return (∅, ti(t, i))
    case:
        i is a base instantiation: return BASE(t, i, P)
        i = pred(c_1 → s_1, ..., c_n → s_n):
            if (t ≠ pred(t_1, ..., t_n)) return (⊤, _)
            let t be of the form pred(t_1, ..., t_n)
            for j := 1..n
                (tc_j, xc_j) := RT(t_j, c_j, P)
                (ts_j, xs_j) := RT(t_j, s_j, P)
                if (tc_j = ⊤ or ts_j = ⊤) return (⊤, _)
            endfor
            r := {ti(t, i) → $ipred$(xc_1, xs_1, ..., xc_n, xs_n)} ∪
                 tc_1 ∪ ... ∪ tc_n ∪ ts_1 ∪ ... ∪ ts_n
            return (r, ti(t, i))
        default:
            if (t ∈ V_type) return (⊤, _)
            r := ∅
            foreach x → f(x_i1, ..., x_in) ∈ rules(i)
                if (∃ x' → f(x_t1, ..., x_tn) ∈ rules(t))
                    for j := 1..n
                        (r_j, x_j) := RT(x_tj, x_ij, P ∪ {ti(t, i)})
                        if (r_j = ⊤) return (⊤, _)
                    endfor
                    r := {ti(t, i) → f(x_1, ..., x_n)} ∪ r ∪ r_1 ∪ ... ∪ r_n
                endif
            endfor
            return (r, ti(t, i))

BASE(t, base, P)
    if (base = new) return ({new → #fresh#}, new)
    if (ti(t, base) ∈ P) return (∅, ti(t, base))
    if (t ∈ V_type):
        if (base = ground) return ({ti(t, ground) → $ground(t)$}, ti(t, ground))
        else return ({ti(t, old) → $ground(t)$ ; $old(t)$}, ti(t, old))
    else if (t is of the form pred(t_1, ..., t_n))
        return ({ti(pred(t_1, ..., t_n), ground) → $gpred$}, ti(pred(t_1, ..., t_n), ground))
    else
        r := ∅
        foreach x → f(t_1, ..., t_n) in rules(t)
            for j ∈ 1..n
                (r_j, x_j) := BASE(t_j, base, P ∪ {ti(t, base)})
            endfor
            r := {ti(t, base) → f(x_1, ..., x_n)} ∪ r ∪ r_1 ∪ ... ∪ r_n
        endfor
        if (base = old and t is a solver type) then r := {ti(t, base) → #var#} ∪ r
        return (r, ti(t, base))



